04 Dec, 2019

1 commit


01 Dec, 2019

1 commit

  • When a classful qdisc's child qdisc has set the flag
    TCQ_F_CPUSTATS (pfifo_fast for example), the child qdisc's
    cpu_bstats should be passed to gnet_stats_copy_basic(),
    but many classful qdiscs didn't do that. As a result,
    `tc -s class show dev DEV` always returns 0 for bytes and
    packets in this case.

    Pass the child qdisc's cpu_bstats to gnet_stats_copy_basic()
    to fix this issue.

    The qstats had the same problem, which was fixed
    in 5dd431b6b9 ("net: sched: introduce and use qstats read...");
    bstats remained buggy.

    Fixes: 22e0f8b9322c ("net: sched: make bstats per cpu and estimator RCU safe")
    Signed-off-by: Dust Li
    Signed-off-by: Tony Lu
    Acked-by: Cong Wang
    Signed-off-by: David S. Miller

    Dust Li
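
The bug described in the commit above can be illustrated with a toy user-space model. This is only a sketch in Python, not kernel code: `ToyQdisc`, `copy_basic` and `buggy_copy_basic` are hypothetical names, and the flag value is an arbitrary marker chosen for the example.

```python
from dataclasses import dataclass, field
from typing import List

TCQ_F_CPUSTATS = 0x20  # arbitrary marker value for this sketch


@dataclass
class BasicStats:
    nbytes: int = 0
    packets: int = 0


@dataclass
class ToyQdisc:
    flags: int = 0
    # Global counters, used by qdiscs that do not keep per-CPU stats.
    bstats: BasicStats = field(default_factory=BasicStats)
    # One counter pair per CPU, used when TCQ_F_CPUSTATS is set.
    cpu_bstats: List[BasicStats] = field(default_factory=list)


def copy_basic(q: ToyQdisc) -> BasicStats:
    """Fold per-CPU counters when the qdisc keeps them, else read the global ones."""
    if q.flags & TCQ_F_CPUSTATS:
        return BasicStats(
            sum(c.nbytes for c in q.cpu_bstats),
            sum(c.packets for c in q.cpu_bstats),
        )
    return BasicStats(q.bstats.nbytes, q.bstats.packets)


def buggy_copy_basic(q: ToyQdisc) -> BasicStats:
    """Model of the bug: the per-CPU counters are never consulted."""
    return BasicStats(q.bstats.nbytes, q.bstats.packets)
```

In this model a child that only updates its per-CPU counters reports zero through `buggy_copy_basic`, which matches the symptom seen in the `tc -s class show` output.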
     

19 Jun, 2019

1 commit

  • Based on 2 normalized pattern(s):

    this program is free software you can redistribute it and or modify
    it under the terms of the gnu general public license version 2 as
    published by the free software foundation

    this program is free software you can redistribute it and or modify
    it under the terms of the gnu general public license version 2 as
    published by the free software foundation #

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-only

    has been chosen to replace the boilerplate/reference in 4122 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Enrico Weigelt
    Reviewed-by: Kate Stewart
    Reviewed-by: Allison Randal
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

02 Apr, 2019

1 commit

  • Classful qdiscs can't directly access the child qdisc's backlog
    length: if such a qdisc is NOLOCK, per-CPU values should be
    accounted instead.

    Most qdiscs do not respect the above. As a result, qstats fetching
    for most classful qdiscs is currently incorrect: if the child qdisc is
    NOLOCK, it always reports 0 len backlog.

    This change introduces a pair of helpers to safely fetch
    both backlog and qlen and uses them in the class stats dumping
    functions, fixing the above issue and cleaning up the code a bit.

    DRR also needs to access the child qdisc queue length, so it
    needs custom handling.

    Fixes: c5ad119fb6c0 ("net: sched: pfifo_fast use skb_array")
    Signed-off-by: Paolo Abeni
    Signed-off-by: David S. Miller

    Paolo Abeni
     

15 Nov, 2018

1 commit

  • Drivers offloading Qdiscs should have reasonable certainty
    the offloaded behaviour matches the SW path. This is impossible
    if the driver does not know about all Qdiscs or when Qdiscs move
    and are reused. Send a graft notification from MQ.

    Signed-off-by: Jakub Kicinski
    Reviewed-by: John Hurley
    Signed-off-by: David S. Miller

    Jakub Kicinski
     

09 Nov, 2018

1 commit


26 Sep, 2018

1 commit

  • The current implementation of qdisc_destroy() decrements the Qdisc reference
    counter and only actually destroys the Qdisc if the reference counter reaches
    zero. Rename qdisc_destroy() to qdisc_put() so that the name better
    describes how this function is currently implemented and used.

    Extract the code that deallocates the Qdisc into a new private qdisc_destroy()
    function. It is intended to be shared between the regular qdisc_put() and its
    unlocked version that is introduced in the next patch in this series.

    Signed-off-by: Vlad Buslov
    Acked-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Vlad Buslov
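
The put/destroy split described above follows a common reference-counting pattern. A minimal sketch in Python, assuming a hypothetical `RefQdisc` class and helper names that only mirror the roles in the commit message, not the kernel implementation:

```python
class RefQdisc:
    """Toy object with a reference counter (hypothetical, for illustration)."""

    def __init__(self):
        self.refcnt = 1        # the creator holds the first reference
        self.destroyed = False


def _destroy(q: RefQdisc) -> None:
    """Private helper: the actual deallocation step."""
    q.destroyed = True


def get(q: RefQdisc) -> None:
    """Take an additional reference."""
    q.refcnt += 1


def put(q: RefQdisc) -> None:
    """Drop one reference; deallocate only when the last one is gone."""
    q.refcnt -= 1
    if q.refcnt == 0:
        _destroy(q)
```

Naming the public function `put` makes it clear to callers that the object may outlive the call, which is the point of the rename.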
     

29 May, 2018

2 commits

  • MQ doesn't hold any statistics on its own. However, statistics
    from offloads are requested starting from the root, hence MQ
    will read the old values for its sums. Call into the drivers:
    because of the additive nature of the stats, drivers are aware
    of how many "pending updates" they have for children of the MQ.
    Since MQ resets its stats on every dump, we can simply offset
    the stats, predicting how the stats of offloaded children will
    change.

    Signed-off-by: Jakub Kicinski
    Signed-off-by: David S. Miller

    Jakub Kicinski
     
  • mq offload is trivial, we just need to let the device know
    that the root qdisc is mq. An alternative approach would be
    to export qdisc_lookup() and make drivers check the root
    type themselves, but notification via ndo_setup_tc is more
    in line with other qdiscs.

    Note that mq doesn't hold any stats on its own, it just
    adds up stats of its children.

    Signed-off-by: Jakub Kicinski
    Signed-off-by: David S. Miller

    Jakub Kicinski
     

22 Dec, 2017

3 commits

  • This patch adds extack support for the function qdisc_create_dflt, which is
    a commonly used function in the tc subsystem. Callers that are interested
    in the resulting error can pass extack to get more detailed
    information about why qdisc_create_dflt failed. The function qdisc_create_dflt
    also calls an init callback, which can fail in any per-qdisc specific
    handling.

    Cc: David Ahern
    Acked-by: Jamal Hadi Salim
    Signed-off-by: Alexander Aring
    Signed-off-by: David S. Miller

    Alexander Aring
     
  • This patch adds extack support for graft callback to prepare per-qdisc
    specific changes for extack.

    Cc: David Ahern
    Acked-by: Jamal Hadi Salim
    Signed-off-by: Alexander Aring
    Signed-off-by: David S. Miller

    Alexander Aring
     
  • This patch adds extack support for init callback to prepare per-qdisc
    specific changes for extack.

    Cc: David Ahern
    Acked-by: Jamal Hadi Salim
    Signed-off-by: Alexander Aring
    Signed-off-by: David S. Miller

    Alexander Aring
     

09 Dec, 2017

2 commits

  • The sch_mqprio qdisc creates a sub-qdisc per tx queue which are then
    called independently for enqueue and dequeue operations. However
    statistics are aggregated and pushed up to the "master" qdisc.

    This patch adds support for any of the sub-qdiscs to be per cpu
    statistic qdiscs. To handle this case add a check when calculating
    stats and aggregate the per cpu stats if needed.

    Signed-off-by: John Fastabend
    Signed-off-by: David S. Miller

    John Fastabend
     
  • The sch_mq qdisc creates a sub-qdisc per tx queue which are then
    called independently for enqueue and dequeue operations. However
    statistics are aggregated and pushed up to the "master" qdisc.

    This patch adds support for any of the sub-qdiscs to be per cpu
    statistic qdiscs. To handle this case add a check when calculating
    stats and aggregate the per cpu stats if needed.

    Also exports __gnet_stats_copy_queue() to use as a helper function.

    Signed-off-by: John Fastabend
    Signed-off-by: David S. Miller

    John Fastabend
     

28 Oct, 2017

1 commit

  • Currently, the class_ops select_queue() implementation on sch_mq
    returns a pointer to netdev_queue #0 when it receives an invalid
    qdisc id. That can be misleading since all of mq's inner qdiscs are
    attached to a valid netdev_queue.

    Here we fix that by returning NULL when a qdisc id is invalid. This is
    aligned with how select_queue() is implemented for sch_mqprio in the
    next patch in this series, keeping a consistent behavior between these
    two qdiscs.

    Signed-off-by: Jesus Sanchez-Palencia
    Tested-by: Henrik Austad
    Signed-off-by: Jeff Kirsher

    Jesus Sanchez-Palencia
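
The behavior change above can be sketched in a few lines of Python. The function and the 1-based id convention are assumptions made for illustration; `None` stands in for the NULL return.

```python
from typing import List, Optional


def select_queue(tx_queues: List[str], class_id: int) -> Optional[str]:
    """Map a class id to a tx queue, or return None for an invalid id.

    Assumption for this sketch: class ids are 1-based, so id 1 maps to
    the first queue.
    """
    index = class_id - 1
    if 0 <= index < len(tx_queues):
        return tx_queues[index]
    # Previously modelled behavior would have been: return tx_queues[0]
    return None
```

Returning `None` forces the caller to handle the invalid id explicitly instead of silently operating on queue #0.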
     

26 Aug, 2017

1 commit

  • For TC classes, their ->get() and ->put() are always paired, and the
    reference counting is completely useless, because:

    1) For class modification and dumping paths, we already hold the RTNL lock,
    so all of these ->get(),->change(),->put() are atomic.

    2) For filter binding/unbinding, we use a different reference counter from
    this one, and they should hold the RTNL lock too.

    3) For ->qlen_notify(), it is special because it is called on the ->enqueue()
    path, but we already hold the qdisc tree lock there, and we hold this
    tree lock when grafting or deleting the class too, so it should not be gone
    or changed until we release the tree lock.

    Therefore, this patch removes ->get() and ->put(), but:

    1) Adds a new ->find() to find the pointer to a class by classid, no
    refcnt.

    2) Move the original class destroy upon the last refcnt into ->delete(),
    right after releasing tree lock. This is fine because the class is
    already removed from hash when holding the lock.

    For those who also use ->put() as ->unbind(), just rename them to reflect
    this change.

    Cc: Jamal Hadi Salim
    Signed-off-by: Cong Wang
    Acked-by: Jiri Pirko
    Acked-by: Jamal Hadi Salim
    Signed-off-by: David S. Miller

    WANG Cong
     

13 Mar, 2017

1 commit

  • The original reason [1] for having hidden qdiscs (potential scalability
    issues in qdisc_match_from_root() with single linked list in case of large
    amount of qdiscs) has been invalidated by 59cc1f61f0 ("net: sched: convert
    qdisc linked list to hashtable").

    This allows us to bring more clarity and determinism into the dump by
    making default pfifo qdiscs visible.

    We're not turning this on by default though, as it was deemed [2] too
    intrusive / unnecessary a change of default behavior towards userspace.
    Instead, TCA_DUMP_INVISIBLE netlink attribute is introduced, which allows
    applications to request complete qdisc hierarchy dump, including the
    ones that have always been implicit/invisible.

    Singleton noop_qdisc stays invisible, as teaching the whole infrastructure
    about singletons would require quite some surgery with very little gain
    (seeing no qdisc or seeing noop qdisc in the dump is probably setting
    the same user expectation).

    [1] http://lkml.kernel.org/r/1460732328.10638.74.camel@edumazet-glaptop3.roam.corp.google.com
    [2] http://lkml.kernel.org/r/20161021.105935.1907696543877061916.davem@davemloft.net

    Signed-off-by: Jiri Kosina
    Signed-off-by: David S. Miller

    Jiri Kosina
     

12 Feb, 2017

1 commit

  • Dmitry reported use-after-free bugs in qdisc code [1].

    The problem here is that ops->init() can return an error.

    qdisc_create_dflt() then calls ops->destroy(),
    while qdisc_create() does _not_ call it.

    Four qdiscs chose to call their own ops->destroy(), assuming their caller
    would not.

    This patch makes sure qdisc_create() calls ops->destroy()
    and fixes the four qdiscs to avoid a double free.

    [1]
    BUG: KASAN: use-after-free in mq_destroy+0x242/0x290 net/sched/sch_mq.c:33 at addr ffff8801d415d440
    Read of size 8 by task syz-executor2/5030
    CPU: 0 PID: 5030 Comm: syz-executor2 Not tainted 4.3.5-smp-DEV #119
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    0000000000000046 ffff8801b435b870 ffffffff81bbbed4 ffff8801db000400
    ffff8801d415d440 ffff8801d415dc40 ffff8801c4988510 ffff8801b435b898
    ffffffff816682b1 ffff8801b435b928 ffff8801d415d440 ffff8801c49880c0
    Call Trace:
    __dump_stack lib/dump_stack.c:15 [inline]
    dump_stack+0x6c/0x98 lib/dump_stack.c:51
    kasan_object_err+0x21/0x70 mm/kasan/report.c:158
    print_address_description mm/kasan/report.c:196 [inline]
    kasan_report_error+0x1b4/0x4b0 mm/kasan/report.c:285
    kasan_report mm/kasan/report.c:305 [inline]
    __asan_report_load8_noabort+0x43/0x50 mm/kasan/report.c:326
    mq_destroy+0x242/0x290 net/sched/sch_mq.c:33
    qdisc_destroy+0x12d/0x290 net/sched/sch_generic.c:953
    qdisc_create_dflt+0xf0/0x120 net/sched/sch_generic.c:848
    attach_default_qdiscs net/sched/sch_generic.c:1029 [inline]
    dev_activate+0x6ad/0x880 net/sched/sch_generic.c:1064
    __dev_open+0x221/0x320 net/core/dev.c:1403
    __dev_change_flags+0x15e/0x3e0 net/core/dev.c:6858
    dev_change_flags+0x8e/0x140 net/core/dev.c:6926
    dev_ifsioc+0x446/0x890 net/core/dev_ioctl.c:260
    dev_ioctl+0x1ba/0xb80 net/core/dev_ioctl.c:546
    sock_do_ioctl+0x99/0xb0 net/socket.c:879
    sock_ioctl+0x2a0/0x390 net/socket.c:958
    vfs_ioctl fs/ioctl.c:44 [inline]
    do_vfs_ioctl+0x8a8/0xe50 fs/ioctl.c:611
    SYSC_ioctl fs/ioctl.c:626 [inline]
    SyS_ioctl+0x94/0xc0 fs/ioctl.c:617
    entry_SYSCALL_64_fastpath+0x12/0x17

    Signed-off-by: Eric Dumazet
    Reported-by: Dmitry Vyukov
    Signed-off-by: David S. Miller

    Eric Dumazet
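
The ownership rule established by the commit above (the creator calls destroy when init fails, and init itself does not) can be modelled as follows. This is a hedged sketch in Python with hypothetical names (`ToyOps`, `ToyQdisc`, `create`); it does not reproduce the kernel code paths.

```python
class ToyQdisc:
    def __init__(self):
        self.resources = []


class ToyOps:
    """Hypothetical per-qdisc operations for this sketch."""

    def __init__(self, fail_init: bool):
        self.fail_init = fail_init
        self.destroy_calls = 0

    def init(self, q: ToyQdisc) -> int:
        # Partial setup happens before the failure is detected.
        q.resources.append("queue-array")
        # Important: init does NOT call destroy on failure; the caller does.
        return -1 if self.fail_init else 0

    def destroy(self, q: ToyQdisc) -> None:
        self.destroy_calls += 1
        q.resources.clear()


def create(ops: ToyOps):
    """Create a qdisc; on init failure the creator performs the cleanup once."""
    q = ToyQdisc()
    if ops.init(q) != 0:
        ops.destroy(q)
        return None
    return q
```

With a single owner of the cleanup step, destroy runs exactly once on the failure path, which is what rules out the double free.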
     

11 Aug, 2016

1 commit

  • Convert the per-device linked list into a hashtable. The primary
    motivation for this change is that currently, we're not tracking all the
    qdiscs in hierarchy (e.g. excluding default qdiscs), as the lookup
    performed over the linked list by qdisc_match_from_root() is rather
    expensive.

    The ultimate goal is to get rid of hidden qdiscs completely, which will
    bring much more determinism in user experience.

    Reviewed-by: Cong Wang
    Signed-off-by: Jiri Kosina
    Signed-off-by: David S. Miller

    Jiri Kosina
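
The cost difference that motivates the conversion above can be seen in a small Python model. The dictionary-based qdisc records and function names are assumptions for illustration only.

```python
from typing import Dict, List, Optional

Qdisc = Dict[str, object]


def lookup_list(qdiscs: List[Qdisc], handle: int) -> Optional[Qdisc]:
    """Linear scan: cost grows with the number of qdiscs on the device."""
    for q in qdiscs:
        if q["handle"] == handle:
            return q
    return None


def build_table(qdiscs: List[Qdisc]) -> Dict[int, Qdisc]:
    """Index the qdiscs by handle once."""
    return {q["handle"]: q for q in qdiscs}


def lookup_table(table: Dict[int, Qdisc], handle: int) -> Optional[Qdisc]:
    """Hashed lookup: expected constant time per query."""
    return table.get(handle)
```

Once lookups are cheap, tracking every qdisc in the hierarchy, including default ones, no longer penalizes the lookup path.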
     

08 Jun, 2016

1 commit

  • Large tc dumps (tc -s {qdisc|class} sh dev ethX) done by Google BwE host
    agent [1] are problematic at scale :

    For each qdisc/class found in the dump, we currently lock the root qdisc
    spinlock in order to get stats. Sampling stats every 5 seconds from
    thousands of HTB classes is a challenge when the root qdisc spinlock is
    under high pressure. Not only do the dumps take time, they also slow
    down the fast path (queue/dequeue packets) by 10 % to 20 % in some cases.

    An audit of existing qdiscs showed that sch_fq_codel is the only qdisc
    that might need the qdisc lock in fq_codel_dump_stats() and
    fq_codel_dump_class_stats()

    In v2 of this patch, I now use the Qdisc running seqcount to provide
    consistent reads of packets/bytes counters, regardless of 32/64 bit arches.

    I also changed rate estimators to use the same infrastructure
    so that they no longer need to lock root qdisc lock.

    [1]
    http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43838.pdf

    Signed-off-by: Eric Dumazet
    Cc: Cong Wang
    Cc: Jamal Hadi Salim
    Cc: John Fastabend
    Cc: Kevin Athey
    Cc: Xiaotian Pei
    Signed-off-by: David S. Miller

    Eric Dumazet
     

04 Mar, 2016

1 commit


04 Dec, 2015

1 commit

  • qdisc_tree_decrease_qlen() suffers from two problems on multiqueue
    devices.

    One problem is that it updates sch->q.qlen and sch->qstats.drops
    on the mq/mqprio root qdisc, while it should not: Daniele
    reported underflow errors:
    [ 681.774821] PAX: sch->q.qlen: 0 n: 1
    [ 681.774825] PAX: size overflow detected in function qdisc_tree_decrease_qlen net/sched/sch_api.c:769 cicus.693_49 min, count: 72, decl: qlen; num: 0; context: sk_buff_head;
    [ 681.774954] CPU: 2 PID: 19 Comm: ksoftirqd/2 Tainted: G O 4.2.6.201511282239-1-grsec #1
    [ 681.774955] Hardware name: ASUSTeK COMPUTER INC. X302LJ/X302LJ, BIOS X302LJ.202 03/05/2015
    [ 681.774956] ffffffffa9a04863 0000000000000000 0000000000000000 ffffffffa990ff7c
    [ 681.774959] ffffc90000d3bc38 ffffffffa95d2810 0000000000000007 ffffffffa991002b
    [ 681.774960] ffffc90000d3bc68 ffffffffa91a44f4 0000000000000001 0000000000000001
    [ 681.774962] Call Trace:
    [ 681.774967] dump_stack+0x4c/0x7f
    [ 681.774970] report_size_overflow+0x34/0x50
    [ 681.774972] qdisc_tree_decrease_qlen+0x152/0x160
    [ 681.774976] fq_codel_dequeue+0x7b1/0x820 [sch_fq_codel]
    [ 681.774978] ? qdisc_peek_dequeued+0xa0/0xa0 [sch_fq_codel]
    [ 681.774980] __qdisc_run+0x4d/0x1d0
    [ 681.774983] net_tx_action+0xc2/0x160
    [ 681.774985] __do_softirq+0xf1/0x200
    [ 681.774987] run_ksoftirqd+0x1e/0x30
    [ 681.774989] smpboot_thread_fn+0x150/0x260
    [ 681.774991] ? sort_range+0x40/0x40
    [ 681.774992] kthread+0xe4/0x100
    [ 681.774994] ? kthread_worker_fn+0x170/0x170
    [ 681.774995] ret_from_fork+0x3e/0x70

    mq/mqprio have their own ways to report qlen/drops by folding stats on
    all their queues, with appropriate locking.

    A second problem is that qdisc_tree_decrease_qlen() calls qdisc_lookup()
    without proper locking: concurrent qdisc updates could corrupt the list
    that qdisc_match_from_root() parses to find a qdisc given its handle.

    Fix the first problem by adding a TCQ_F_NOPARENT qdisc flag that
    qdisc_tree_decrease_qlen() can use to abort its tree traversal
    as soon as it meets a child qdisc of mq/mqprio.

    The second problem can be fixed by RCU protection.
    Qdiscs are already freed after an RCU grace period, so qdisc_list_add() and
    qdisc_list_del() simply have to use the appropriate RCU list variants.

    A future patch will add a per struct netdev_queue list anchor, so that
    qdisc_tree_decrease_qlen() can have more efficient lookups.

    Reported-by: Daniele Fucini
    Signed-off-by: Eric Dumazet
    Cc: Cong Wang
    Cc: Jamal Hadi Salim
    Signed-off-by: David S. Miller

    Eric Dumazet
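
The first fix above (stop the upward walk at a child of mq/mqprio) can be sketched with a toy tree. This Python model uses hypothetical names and an arbitrary flag value; it assumes, for illustration, that the walk adjusts ancestors only and that the starting qdisc's own qlen is handled by the caller.

```python
TCQ_F_NOPARENT = 0x40  # arbitrary marker value for this sketch


class Node:
    """Toy qdisc with a parent pointer instead of a handle lookup."""

    def __init__(self, name, parent=None, flags=0, qlen=0):
        self.name = name
        self.parent = parent
        self.flags = flags
        self.qlen = qlen


def tree_decrease_qlen(sch: Node, n: int) -> None:
    """Propagate a qlen decrease to ancestors, stopping below mq/mqprio."""
    while sch.parent is not None:
        if sch.flags & TCQ_F_NOPARENT:
            break                 # sch is a direct child of the mq-like root
        sch = sch.parent
        sch.qlen -= n
```

Because the root never sees the decrement, its qlen cannot underflow; in the commit's words, mq/mqprio fold the stats of their queues on their own.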
     

30 Sep, 2014

3 commits

  • After previous patches to simplify qstats the qstats can be
    made per cpu with a packed union in Qdisc struct.

    Signed-off-by: John Fastabend
    Signed-off-by: David S. Miller

    John Fastabend
     
  • This removes the use of qstats->qlen variable from the classifiers
    and makes it an explicit argument to gnet_stats_copy_queue().

    The qlen represents the qdisc queue length and is packed into
    the qstats at the last moment before passing to user space. By
    handling it explicitly we avoid, in the percpu stats case, having
    to figure out which per_cpu variable to put it in.

    It would probably be best to remove it from qstats completely
    but qstats is a user space ABI and can't be broken. A future
    patch could make an internal only qstats structure that would
    avoid having to allocate an additional u32 variable on the
    Qdisc struct. This would make the qstats struct 128bits instead
    of 128+32.

    Signed-off-by: John Fastabend
    Signed-off-by: David S. Miller

    John Fastabend
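
The "128 bits instead of 128+32" remark above can be checked with a quick size calculation. The sketch assumes the user-visible qstats structure consists of five unsigned 32-bit fields (qlen, backlog, drops, requeues, overlimits); that layout is an assumption of this example, not something stated in the commit message.

```python
import struct

# "=" selects standard sizes with no padding; "I" is an unsigned 32-bit int.
WITH_QLEN = "=5I"     # assumed fields: qlen, backlog, drops, requeues, overlimits
WITHOUT_QLEN = "=4I"  # hypothetical internal-only variant without qlen


def size_in_bits(fmt: str) -> int:
    """Size of a packed structure described by a struct format string."""
    return struct.calcsize(fmt) * 8
```

Under that assumption, `size_in_bits(WITH_QLEN)` is 160 (128+32) and `size_in_bits(WITHOUT_QLEN)` is 128.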
     
  • In order to run qdiscs without locking, statistics and estimators
    need to be handled correctly.

    To resolve bstats, make the statistics per cpu. Because this is
    only needed for qdiscs that are running without locks, which will not
    be the case for most qdiscs in the near future, only create percpu
    stats when qdiscs set the TCQ_F_CPUSTATS flag.

    Next, because estimators use the bstats to calculate packets per
    second and bytes per second, the estimator code paths are updated
    to use the per cpu statistics.

    Signed-off-by: John Fastabend
    Signed-off-by: David S. Miller

    John Fastabend
     

10 Dec, 2013

1 commit

  • Commit 6da7c8fcbcbd ("qdisc: allow setting default queuing discipline")
    added the ability to change the default qdisc from pfifo_fast to, say, fq.

    But as most modern ethernet devices are multiqueue, we can't really
    see all the statistics from "tc -s qdisc show", as the default root
    qdisc is mq.

    This patch adds the calls to qdisc_list_add() to mq and mqprio.

    Signed-off-by: Eric Dumazet
    Cc: Stephen Hemminger
    Signed-off-by: David S. Miller

    Eric Dumazet
     

31 Aug, 2013

1 commit

  • Until now, the pfifo_fast queue discipline has been used by default
    for all devices. But we have better choices now.

    This patch allows setting the default queueing discipline with sysctl.
    This allows easy use of better queueing disciplines on all devices
    without having to use tc qdisc scripts. It is intended to allow
    an easy path for distributions to make fq_codel or sfq the default
    qdisc.

    This patch also makes pfifo_fast more of a first class qdisc, since
    it is now possible to manually override the default and explicitly
    use pfifo_fast. The behavior for systems that do not use the sysctl
    is unchanged: they still get pfifo_fast.

    Also removes leftover random # in sysctl net core.

    Signed-off-by: Stephen Hemminger
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    stephen hemminger
     

12 Dec, 2012

1 commit

  • With BQL being deployed, we are more likely to see the following behavior:

    We dequeue a packet from qdisc in dequeue_skb(), then we realize target
    tx queue is in XOFF state in sch_direct_xmit(), and we have to hold the
    skb in gso_skb for later.

    This shows in stats (tc -s qdisc dev eth0) as requeues.

    The problem with these requeues is that high priority packets cannot be
    dequeued as long as this (possibly low prio and big TSO) packet is not
    removed from gso_skb.

    At 1Gbps speed, a full size TSO packet is 500 us of extra latency.

    In some cases, we know that all packets dequeued from a qdisc are
    for a particular and known txq :

    - If device is non multi queue
    - For all MQ/MQPRIO slave qdiscs

    This patch introduces a new qdisc flag, TCQ_F_ONETXQUEUE to mark
    this capability, so that dequeue_skb() is allowed to dequeue a packet
    only if the associated txq is not stopped.

    This indeed reduces latencies for high prio packets (or improves fairness
    with sfq/fq_codel), and almost removes qdisc 'requeues'.

    Signed-off-by: Eric Dumazet
    Cc: Jamal Hadi Salim
    Cc: John Fastabend
    Signed-off-by: David S. Miller

    Eric Dumazet
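
The effect of the new flag described above can be modelled in user space. This is a simplified Python sketch, not the kernel logic: the class names, `run_once`, and the flag value are assumptions for illustration.

```python
from collections import deque

TCQ_F_ONETXQUEUE = 0x10  # arbitrary marker value for this sketch


class ToyTxq:
    def __init__(self, stopped=False):
        self.stopped = stopped


class ToyQdisc:
    def __init__(self, txq, flags=0):
        self.txq = txq
        self.flags = flags
        self.queue = deque()
        self.gso_skb = None   # packet parked after a failed transmit
        self.requeues = 0


def dequeue_skb(q):
    """Return the next packet, or None if nothing may be dequeued now."""
    if q.gso_skb is not None:
        if q.txq.stopped:
            return None
        skb, q.gso_skb = q.gso_skb, None
        return skb
    # With a single known txq, do not dequeue while that txq is stopped.
    if (q.flags & TCQ_F_ONETXQUEUE) and q.txq.stopped:
        return None
    return q.queue.popleft() if q.queue else None


def run_once(q):
    """One transmit attempt: 'sent', 'requeued' or 'idle'."""
    skb = dequeue_skb(q)
    if skb is None:
        return "idle"
    if q.txq.stopped:
        q.gso_skb = skb       # hold the packet for later
        q.requeues += 1
        return "requeued"
    return "sent"
```

Without the flag, the first packet gets stuck in `gso_skb` and blocks everything behind it; with the flag, the queue stays intact, so the qdisc can still reorder packets by priority before the txq restarts.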
     

01 Nov, 2011

1 commit


22 Jan, 2011

1 commit

  • Now that qdisc stab is handled before the TCQ_F_CAN_BYPASS test in
    __dev_xmit_skb(), we can generalize TCQ_F_CAN_BYPASS to qdiscs other
    than pfifo_fast: pfifo, bfifo, pfifo_head_drop and sfq.

    SFQ is special because it can have external classifiers, and in these
    cases, we cannot bypass queue discipline (packet could be dropped by
    classifier) without admin asking it, or further changes.

    It's worth doing this, especially for SFQ, avoiding dirtying memory in
    case no packets are already waiting in queue.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
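
The bypass condition discussed above amounts to "flag set and queue empty". A minimal Python sketch under that assumption, with hypothetical names and an arbitrary flag value:

```python
from collections import deque

TCQ_F_CAN_BYPASS = 0x4  # arbitrary marker value for this sketch


class ToyQdisc:
    def __init__(self, flags=0):
        self.flags = flags
        self.queue = deque()


def xmit(q, skb, wire):
    """Send directly when bypass is allowed and nothing is queued, else enqueue."""
    if (q.flags & TCQ_F_CAN_BYPASS) and not q.queue:
        wire.append(skb)      # the qdisc's own queue is never touched
        return "bypass"
    q.queue.append(skb)
    return "enqueue"
```

In this model a qdisc with an external classifier attached would simply be created without the flag, so every packet still goes through the queue discipline.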
     

21 Oct, 2010

1 commit


18 May, 2010

1 commit

  • This patch removes from net/ (but not any netfilter files)
    all the unnecessary return; statements that precede the
    last closing brace of void functions.

    It does not remove the returns that are immediately
    preceded by a label as gcc doesn't like that.

    Done via:
    $ grep -rP --include=*.[ch] -l "return;\n}" net/ | \
    xargs perl -i -e 'local $/ ; while (<>) { s/\n[ \t\n]+return;\n}/\n}/g; print; }'

    Signed-off-by: Joe Perches
    Signed-off-by: David S. Miller

    Joe Perches
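
For readers less familiar with Perl, the substitution in the one-liner above can be expressed with Python's `re` module. This is only a sketch of the same regular expression applied to a string; the function name is hypothetical and the file-walking part of the original command is omitted.

```python
import re

# Same pattern as the Perl one-liner: a newline, then whitespace-only
# filler, then "return;" directly before the closing brace.
_TRAILING_RETURN = re.compile(r"\n[ \t\n]+return;\n}")


def strip_trailing_return(source: str) -> str:
    """Drop a bare 'return;' that immediately precedes a closing brace."""
    return _TRAILING_RETURN.sub("\n}", source)
```

Like the original pattern, this works on the whole file content at once, which is why the Perl version clears `$/` before reading.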
     

30 Mar, 2010

1 commit

  • …it slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities to include those
    headers directly instead of assuming availability. As this conversion
    needs to touch a large number of source files, the following script is
    used as the basis of conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the following.

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there. ie. if only gfp is used,
    gfp.h, if slab is used, slab.h.

    * When the script inserts a new include, it looks at the include
    blocks and tries to put the new include such that its order conforms
    to its surroundings. It's put in the include block which contains
    core kernel includes, in the same order that the rest are ordered -
    alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
    doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have fitting include block), it prints out
    an error message indicating which .h file needs to be added to the
    file.

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition while adding it to implementation .h or
    embedding .c file was more appropriate for others. This step added
    inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored as stuff from gfp.h was usually
    widely available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build tests were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on archs to make things
    build (like ipr on powerpc/64 which failed due to missing writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as bisection point.

    Given the fact that I had only a couple of failures from tests on step
    6, I'm fairly confident about the coverage of this conversion patch.
    If there is a breakage, it's likely to be something in one of the arch
    headers which should be easily discoverable on most builds of
    the specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
     

18 Sep, 2009

1 commit


15 Sep, 2009

1 commit

  • After the recent mq change there is the new select_queue qdisc class
    method used in tc_modify_qdisc, but it works OK only for direct child
    qdiscs of mq qdisc. Grandchildren always get the first tx queue, which
    would give wrong qdisc_root etc. results (e.g. for sch_htb as child of
    sch_prio). This patch fixes it by using parent's dev_queue for such
    grandchildren qdiscs. The select_queue method's return type is changed
    BTW.

    With feedback from: Patrick McHardy

    Signed-off-by: Jarek Poplawski
    Signed-off-by: David S. Miller

    Jarek Poplawski
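
The queue selection rule described above (ask the parent if it can select a queue, otherwise inherit the parent's queue) can be sketched as follows. The Python names are hypothetical and the model ignores everything except the choice of device queue.

```python
class ToyQdisc:
    def __init__(self, dev_queue, select_queue=None):
        self.dev_queue = dev_queue
        # Only mq-like parents provide a select_queue callback in this model.
        self.select_queue = select_queue


def dev_queue_for_child(parent, class_id, default_queue):
    """Pick the device queue for a new child qdisc."""
    if parent is None:
        return default_queue                  # no parent: fall back to the default
    if parent.select_queue is not None:
        return parent.select_queue(class_id)  # direct child of an mq-like qdisc
    return parent.dev_queue                   # grandchild: inherit from the parent
```

In this model a qdisc grafted under a child of mq ends up on the same tx queue as its parent rather than always on the first one.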
     

10 Sep, 2009

1 commit

  • When new child qdiscs are attached to the mq qdisc, they are actually
    attached as root qdiscs to the device queues. The lock selection for
    new estimators incorrectly picks the root lock of the existing and
    to be replaced qdisc, which results in a use-after-free once the old
    qdisc has been destroyed.

    Mark mq qdisc instances with a new flag and treat qdiscs attached to
    mq as children similar to regular root qdiscs.

    Additionally prevent estimators from being attached to the mq qdisc
    itself since it only updates its byte and packet counters during dumps.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     

06 Sep, 2009

1 commit

  • This patch adds a classful dummy scheduler which can be used as root qdisc
    for multiqueue devices and exposes each device queue as a child class.

    This allows queues to be addressed individually and grafted similar to regular
    classes. Additionally it presents an accumulated view of the statistics of
    all real root qdiscs in the dummy root.

    Two new callbacks are added to the qdisc_ops and qdisc_class_ops:

    - cl_ops->select_queue selects the tx queue number for new child classes.

    - qdisc_ops->attach() overrides root qdisc device grafting to attach
    non-shared qdiscs to the queues.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    David S. Miller