12 Jul, 2013

2 commits

  • This patch removes the forward declaration of qfq_update_agg_ts, by moving
    the definition of the function above its first call. This patch also
    removes a useless forward declaration of qfq_schedule_agg.

    Reported-by: David S. Miller
    Signed-off-by: Paolo Valente
    Signed-off-by: David S. Miller

    Paolo Valente
     
  • In make_eligible, a mask is used to decide which groups must become eligible:
    the i-th group becomes eligible only if the i-th bit of the mask (from the
    right) is set. The mask is computed by left-shifting a 1 by a given number of
    places and decrementing the result. The shift is performed on a ULL to avoid
    problems in case the number of places to shift is higher than 31. On a 32-bit
    machine, this is more costly than working on a UL. This patch replaces that
    costly operation with two cheaper branches.

    The trick is based on the following fact: in case of a shift of at least 32
    places, the resulting mask has at least its 32 least significant bits set,
    whereas the total number of groups is lower than 32. As a consequence, in this
    case it is enough to just set the 32 least significant bits of the mask with a
    cheaper ~0UL. In the other case, the shift can be safely performed on a UL.

    Reported-by: David S. Miller
    Reported-by: David Laight
    Signed-off-by: Paolo Valente
    Signed-off-by: David S. Miller

    Paolo Valente
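    The two-branch replacement described above can be sketched in plain C as
    follows (hypothetical helper name; the real code lives in sch_qfq.c and
    works on the group bitmaps directly):

```c
/* Hypothetical userspace sketch of the trick: instead of computing
 * (1ULL << shift) - 1 with a 64-bit shift, branch on the shift amount.
 * A shift of 32 or more would set at least the 32 least significant
 * bits, and there are fewer than 32 groups, so ~0UL is enough; for a
 * smaller shift, the operation is safe on an unsigned long even on
 * 32-bit machines. */
static unsigned long eligibility_mask(unsigned int shift)
{
	if (shift >= 32)
		return ~0UL;		/* covers every possible group */
	return (1UL << shift) - 1;	/* safe: shift < 32 */
}
```

    The branch predictor makes the two branches cheap on the hot path,
    while the 64-bit shift would be a multi-instruction sequence on a
    32-bit machine.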
     

04 Jul, 2013

1 commit

  • commit aec0a40a6f7884 ("netem: use rb tree to implement the time queue")
    introduced a regression: if a child qdisc is attached to netem, we perform
    a NULL dereference.

    Fix this by adding a temporary variable to cache
    netem_skb_cb(skb)->time_to_send.

    Reported-by: Dan Carpenter
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
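    The fix follows a common kernel pattern: read the field into a local
    variable before an operation that may invalidate the object holding it.
    A minimal userspace sketch (names hypothetical, not the netem code):

```c
struct pkt {
	long time_to_send;
};

/* Cache the field first: the child enqueue may modify or recycle the
 * packet, after which reading the field again would be unsafe. */
static long enqueue_and_report(struct pkt *p,
			       void (*child_enqueue)(struct pkt *))
{
	long tts = p->time_to_send;	/* cached before the call */

	child_enqueue(p);		/* may invalidate p's contents */
	return tts;
}

/* Stand-in child that clobbers the field, to show why caching matters. */
static void clobbering_child(struct pkt *p)
{
	p->time_to_send = -1;
}
```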
     

02 Jul, 2013

1 commit

  • The following typical setup, implementing a ~100 ms RTT and a large
    amount of reordering, has very poor performance because netem
    implements the time queue using a linked list.
    -----------------------------------------------------------
    ETH=eth0
    IFB=ifb0
    modprobe ifb
    ip link set dev $IFB up
    tc qdisc add dev $ETH ingress 2>/dev/null
    tc filter add dev $ETH parent ffff: \
    protocol ip u32 match u32 0 0 flowid 1:1 action mirred egress \
    redirect dev $IFB
    ethtool -K $ETH gro off tso off gso off
    tc qdisc add dev $IFB root netem delay 50ms 10ms limit 100000
    tc qd add dev $ETH root netem delay 50ms limit 100000
    ---------------------------------------------------------

    Switch the netem time queue to an rb tree, so this kind of setup can work
    at high speed.

    Signed-off-by: Eric Dumazet
    Cc: Stephen Hemminger
    Signed-off-by: David S. Miller

    Eric Dumazet
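    To see why the linked list collapses under such setups, here is a small
    userspace sketch (not netem code) that counts comparisons when time-ordered
    arrivals are inserted into a sorted singly linked list; each insert walks
    the whole list, giving O(N²) total work, whereas an rb tree does each
    insert in O(log N):

```c
#include <stddef.h>

struct ent {
	long tstamp;
	struct ent *next;
};

static long comparisons;

/* Insert e into the list sorted by ascending tstamp, counting how many
 * existing entries must be examined to find the insertion point. */
static struct ent *sorted_insert(struct ent *head, struct ent *e)
{
	struct ent **pp = &head;

	while (*pp) {
		comparisons++;
		if (e->tstamp < (*pp)->tstamp)
			break;
		pp = &(*pp)->next;
	}
	e->next = *pp;
	*pp = e;
	return head;
}
```

    With limit 100000, as in the setup above, the list variant performs on
    the order of 10^10 comparisons in the worst case.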
     

20 Jun, 2013

2 commits

  • htb_sched structures are big, and a source of false sharing on SMP.

    Every time a packet is queued or dequeued, many cache lines must be
    touched because the structures are not laid out properly.

    By carefully splitting htb_sched in two parts, and defining sub-structures
    to increase data locality, we can improve performance dramatically on
    SMP.

    New htb_prio structure can also be used in htb_class to increase data
    locality.

    I got a 26% performance increase on a 24-thread machine, with 200
    concurrent netperf in TCP_RR mode, using an HTB hierarchy of 4 classes.

    Signed-off-by: Eric Dumazet
    Cc: Tom Herbert
    Signed-off-by: David S. Miller

    Eric Dumazet
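    The layout idea can be sketched in userspace C11 (field names are
    hypothetical, not the real htb_sched): keep read-mostly configuration
    apart from hot state, and start each group of hot fields on its own
    64-byte cache line so the enqueue and dequeue paths stop false-sharing
    lines.

```c
#include <stdalign.h>
#include <stddef.h>

struct toy_sched {
	/* read-mostly: written only at configuration time */
	int default_class;
	int rate2quantum;

	/* hot dequeue-side state, starting on its own cache line */
	alignas(64) long now;
	long near_ev_cache;

	/* hot enqueue-side state, on a different cache line */
	alignas(64) long direct_pkts;
};
```

    In the kernel the same effect is obtained with ____cacheline_aligned
    annotations; the point is that a writer of direct_pkts no longer
    invalidates the line holding now on other CPUs.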
     
  • Conflicts:
    drivers/net/wireless/ath/ath9k/Kconfig
    drivers/net/xen-netback/netback.c
    net/batman-adv/bat_iv_ogm.c
    net/wireless/nl80211.c

    The ath9k Kconfig conflict was a change of a Kconfig option name right
    next to the deletion of another option.

    The xen-netback conflict was overlapping changes involving the
    handling of the notify list in xen_netbk_rx_action().

    Batman conflict resolution provided by Antonio Quartulli, basically
    keep everything in both conflict hunks.

    The nl80211 conflict is a little more involved. In 'net' we added a
    dynamic memory allocation to nl80211_dump_wiphy() to fix a race that
    Linus reported. Meanwhile in 'net-next' the handlers were converted
    to use pre and post doit handlers which use a flag to determine
    whether to hold the RTNL mutex around the operation.

    However, the dump handlers do not use this logic. Instead they have
    to do the locking explicitly. There were apparent bugs in the
    conversion of nl80211_dump_wiphy() in that we were not dropping the
    RTNL mutex in all the return paths, and it seems we very much should
    be doing so. So I fixed that whilst handling the overlapping changes.

    To simplify the initial returns, I take the RTNL mutex after we try
    to allocate 'tb'.

    Signed-off-by: David S. Miller

    David S. Miller
     

14 Jun, 2013

1 commit

  • htb_class structures are big, and a source of false sharing on SMP.

    By carefully splitting them in two parts, we can improve performance.

    I got a 9% performance increase on a 24-thread machine, with 200
    concurrent netperf in TCP_RR mode, using an HTB hierarchy of 4 classes.

    Signed-off-by: Eric Dumazet
    Cc: Tom Herbert
    Signed-off-by: David S. Miller

    Eric Dumazet
     

12 Jun, 2013

2 commits

  • With a thousand htb classes, est_timer() spends ~5 million cpu cycles
    and thrashes the cpu cache, because each htb class has a default
    rate estimator (est 4sec 16sec).

    Most users do not use the default rate estimators, so switch htb
    to not set them up by default.

    Add a module parameter (htb_rate_est) so that users relying
    on the default rate estimator can restore the old behavior.

    echo 1 >/sys/module/sch_htb/parameters/htb_rate_est

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • Before allowing 64-bit byte rates, refactor
    psched_ratecfg_precompute() to get better comments
    and increased accuracy.

    The rate_bps field is renamed to rate_bytes_ps, as we only
    have to worry about bytes per second.

    Signed-off-by: Eric Dumazet
    Cc: Ben Greear
    Signed-off-by: David S. Miller

    Eric Dumazet
     

11 Jun, 2013

1 commit

  • struct gnet_stats_rate_est contains u32 fields, so the bytes per second
    field can wrap at 34360Mbit.

    Add a new gnet_stats_rate_est64 structure to get 64bit bps/pps fields,
    and switch the kernel to use this structure natively.

    This structure is dumped to user space as a new attribute :

    TCA_STATS_RATE_EST64

    The old tc command will now display the capped bps (at 34360Mbit) instead
    of wrapped values, and an updated tc command will display correct
    information.

    Old tc command output, after the patch:

    eric:~# tc -s -d qd sh dev lo
    qdisc pfifo 8001: root refcnt 2 limit 1000p
    Sent 80868245400 bytes 1978837 pkt (dropped 0, overlimits 0 requeues 0)
    rate 34360Mbit 189696pps backlog 0b 0p requeues 0

    This patch carefully reorganizes "struct Qdisc" layout to get optimal
    performance on SMP.

    Signed-off-by: Eric Dumazet
    Cc: Ben Hutchings
    Signed-off-by: David S. Miller

    Eric Dumazet
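    The 34360Mbit figure is just the u32 limit expressed as a bit rate: a
    u32 bytes-per-second field tops out at 2^32 bytes/s, i.e. about
    34.36 Gbit/s (tc rounds the Mbit value up to 34360). A one-line check:

```c
#include <stdint.h>

/* Capacity of a u32 bytes-per-second field, in Mbit/s (truncated):
 * 2^32 bytes/s * 8 bits/byte = 34359738368 bit/s, ~34360 Mbit/s. */
static uint64_t u32_bps_cap_mbit(void)
{
	return (((uint64_t)UINT32_MAX + 1) * 8) / 1000000;
}
```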
     

08 Jun, 2013

1 commit


06 Jun, 2013

1 commit

  • Merge 'net' bug fixes into 'net-next' as we have patches
    that will build on top of them.

    This merge commit includes a change from Emil Goode
    (emilgoode@gmail.com) that fixes a warning that would
    have been introduced by this merge. Specifically it
    fixes the pingv6_ops method ipv6_chk_addr() to add a
    "const" to the "struct net_device *dev" argument and
    likewise update the dummy_ipv6_chk_addr() declaration.

    Signed-off-by: David S. Miller

    David S. Miller
     

05 Jun, 2013

1 commit

  • commit 56b765b79 ("htb: improved accuracy at high rates") added another
    regression for low rates, because it mixes 1ns and 64ns time units.

    So the maximum delay (mbuffer) was not 60 seconds, but 937 ms.

    Let's convert all time fields to 1ns, as 64-bit arches are becoming the
    norm.

    Reported-by: Jesper Dangaard Brouer
    Signed-off-by: Eric Dumazet
    Tested-by: Jesper Dangaard Brouer
    Signed-off-by: David S. Miller

    Eric Dumazet
     

03 Jun, 2013

1 commit

  • commit 56b765b79 ("htb: improved accuracy at high rates")
    broke the "overhead xxx" handling, as well as the "linklayer atm"
    attribute.

    tc class add ... htb rate X ceil Y linklayer atm overhead 10

    This patch restores the "overhead xxx" handling for htb, tbf
    and act_police.

    The "linklayer atm" handling needs a separate fix.

    Reported-by: Jesper Dangaard Brouer
    Signed-off-by: Eric Dumazet
    Cc: Vimalkumar
    Cc: Jiri Pirko
    Signed-off-by: David S. Miller

    Eric Dumazet
     

29 May, 2013

1 commit

  • So far, only a net_device * could be passed along with a netdevice
    notifier event. This patch provides the possibility to pass a custom
    structure able to provide the info that an event listener needs to know.

    Signed-off-by: Jiri Pirko

    v2->v3: fix typo on simeth
    shortened dev_getter
    shortened notifier_info struct name
    v1->v2: fix notifier_call parameter in call_netdevice_notifier()
    Signed-off-by: David S. Miller

    Jiri Pirko
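    The shape of the change can be sketched as follows (illustrative,
    simplified types, not the kernel's exact definitions): every event
    payload embeds a common info header carrying the device pointer, so
    listeners can always recover the device, while specific events append
    extra fields after the header.

```c
struct toy_dev {
	int ifindex;
};

/* Common header, first member of every event-specific info struct. */
struct toy_notifier_info {
	struct toy_dev *dev;
};

/* Example event payload extending the common header. */
struct toy_notifier_change_info {
	struct toy_notifier_info info;	/* must be first */
	unsigned int flags_changed;	/* event-specific extra data */
};

/* The "shortened dev getter" mentioned in the changelog, sketched: */
static struct toy_dev *info_to_dev(const struct toy_notifier_info *info)
{
	return info->dev;
}
```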
     

23 May, 2013

1 commit

  • If a GSO packet has a length above the tbf burst limit, the packet
    is currently silently dropped.

    The current way to handle this is to set the device in non-GSO/TSO mode,
    or to set high bursts, and it's suboptimal.

    We can actually segment too-big GSO packets, and send individual
    segments as the tbf parameters allow, allowing for better interoperability.

    Signed-off-by: Eric Dumazet
    Cc: Ben Hutchings
    Cc: Jiri Pirko
    Cc: Jamal Hadi Salim
    Reviewed-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Eric Dumazet
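    A userspace model of the idea (hypothetical function, not the tbf code):
    segmenting an oversized GSO packet into MSS-sized pieces lets every piece
    that fits within the burst pass, instead of dropping the whole packet.

```c
/* Count how many mss-sized segments of a gso_len-byte packet would be
 * admitted by a bucket that drops any unit larger than burst bytes.
 * Without segmentation, a packet with gso_len > burst passes 0 times. */
static int admit_segments(int gso_len, int mss, int burst)
{
	int passed = 0;

	for (int off = 0; off < gso_len; off += mss) {
		int seg = gso_len - off < mss ? gso_len - off : mss;

		if (seg <= burst)
			passed++;
	}
	return passed;
}
```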
     

02 May, 2013

2 commits

  • Pull networking updates from David Miller:
    "Highlights (1721 non-merge commits, this has to be a record of some
    sort):

    1) Add 'random' mode to team driver, from Jiri Pirko and Eric
    Dumazet.

    2) Make it so that any driver that supports configuration of multiple
    MAC addresses can provide the forwarding database add and del
    calls by providing a default implementation and hooking that up if
    the driver doesn't have an explicit set of handlers. From Vlad
    Yasevich.

    3) Support GSO segmentation over tunnels and other encapsulating
    devices such as VXLAN, from Pravin B Shelar.

    4) Support L2 GRE tunnels in the flow dissector, from Michael Dalton.

    5) Implement Tail Loss Probe (TLP) detection in TCP, from Nandita
    Dukkipati.

    6) In the PHY layer, allow supporting wake-on-lan in situations where
    the PHY registers have to be written for it to be configured.

    Use it to support wake-on-lan in mv643xx_eth.

    From Michael Stapelberg.

    7) Significantly improve firewire IPV6 support, from YOSHIFUJI
    Hideaki.

    8) Allow multiple packets to be sent in a single transmission using
    network coding in batman-adv, from Martin Hundebøll.

    9) Add support for T5 cxgb4 chips, from Santosh Rastapur.

    10) Generalize the VXLAN forwarding tables so that there is more
    flexibility in configuring various aspects of the endpoints.
    From David Stevens.

    11) Support RSS and TSO in hardware over GRE tunnels in the bnx2x driver,
    from Dmitry Kravkov.

    12) Zero copy support in nfnetlink_queue, from Eric Dumazet and Pablo
    Neira Ayuso.

    13) Start adding networking selftests.

    14) In situations of overload on the same AF_PACKET fanout socket, or
    per-cpu packet receive queue, minimize drop by distributing the
    load to other cpus/fanouts. From Willem de Bruijn and Eric
    Dumazet.

    15) Add support for new payload offset BPF instruction, from Daniel
    Borkmann.

    16) Convert several drivers over to module_platform_driver(), from
    Sachin Kamat.

    17) Provide a minimal BPF JIT image disassembler userspace tool, from
    Daniel Borkmann.

    18) Rewrite F-RTO implementation in TCP to match the final
    specification of it in RFC4138 and RFC5682. From Yuchung Cheng.

    19) Provide netlink socket diag of netlink sockets ("Yo dawg, I hear
    you like netlink, so I implemented netlink dumping of netlink
    sockets.") From Andrey Vagin.

    20) Remove ugly passing of rtnetlink attributes into rtnl_doit
    functions, from Thomas Graf.

    21) Allow userspace to be able to see if a configuration change occurs
    in the middle of an address or device list dump, from Nicolas
    Dichtel.

    22) Support RFC3168 ECN protection for ipv6 fragments, from Hannes
    Frederic Sowa.

    23) Increase accuracy of packet length used by packet scheduler, from
    Jason Wang.

    24) Beginning set of changes to make ipv4/ipv6 fragment handling more
    scalable and less susceptible to overload and locking contention,
    from Jesper Dangaard Brouer.

    25) Get rid of using non-type-safe NLMSG_* macros and use nlmsg_*()
    instead. From Hong Zhiguo.

    26) Optimize route usage in IPVS by avoiding reference counting where
    possible, from Julian Anastasov.

    27) Convert IPVS schedulers to RCU, also from Julian Anastasov.

    28) Support cpu fanouts in xt_NFQUEUE netfilter target, from Holger
    Eitzenberger.

    29) Network namespace support for nf_log, ebt_log, xt_LOG, ipt_ULOG,
    nfnetlink_log, and nfnetlink_queue. From Gao feng.

    30) Implement RFC3168 ECN protection, from Hannes Frederic Sowa.

    31) Support several new r8169 chips, from Hayes Wang.

    32) Support tokenized interface identifiers in ipv6, from Daniel
    Borkmann.

    33) Use usbnet_link_change() helper in USB net driver, from Ming Lei.

    34) Add 802.1ad vlan offload support, from Patrick McHardy.

    35) Support mmap() based netlink communication, also from Patrick
    McHardy.

    36) Support HW timestamping in mlx4 driver, from Amir Vadai.

    37) Rationalize AF_PACKET packet timestamping when transmitting, from
    Willem de Bruijn and Daniel Borkmann.

    38) Bring parity to what's provided by /proc/net/packet socket dumping
    and the info provided by netlink socket dumping of AF_PACKET
    sockets. From Nicolas Dichtel.

    39) Fix peeking beyond zero sized SKBs in AF_UNIX, from Benjamin
    Poirier"

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1722 commits)
    filter: fix va_list build error
    af_unix: fix a fatal race with bit fields
    bnx2x: Prevent memory leak when cnic is absent
    bnx2x: correct reading of speed capabilities
    net: sctp: attribute printl with __printf for gcc fmt checks
    netlink: kconfig: move mmap i/o into netlink kconfig
    netpoll: convert mutex into a semaphore
    netlink: Fix skb ref counting.
    net_sched: act_ipt forward compat with xtables
    mlx4_en: fix a build error on 32bit arches
    Revert "bnx2x: allow nvram test to run when device is down"
    bridge: avoid OOPS if root port not found
    drivers: net: cpsw: fix kernel warn on cpsw irq enable
    sh_eth: use random MAC address if no valid one supplied
    3c509.c: call SET_NETDEV_DEV for all device types (ISA/ISAPnP/EISA)
    tg3: fix to append hardware time stamping flags
    unix/stream: fix peeking with an offset larger than data in queue
    unix/dgram: fix peeking with an offset larger than data in queue
    unix/dgram: peek beyond 0-sized skbs
    openvswitch: Remove unneeded ovs_netdev_get_ifindex()
    ...

    Linus Torvalds
     
  • Deal with changes in newer xtables while maintaining backward
    compatibility. Thanks to Jan Engelhardt for suggestions.

    Signed-off-by: Jamal Hadi Salim
    Signed-off-by: David S. Miller

    Jamal Hadi Salim
     

30 Apr, 2013

2 commits


23 Apr, 2013

1 commit

  • Conflicts:
    drivers/net/ethernet/emulex/benet/be_main.c
    drivers/net/ethernet/intel/igb/igb_main.c
    drivers/net/wireless/brcm80211/brcmsmac/mac80211_if.c
    include/net/scm.h
    net/batman-adv/routing.c
    net/ipv4/tcp_input.c

    The e{uid,gid} --> {uid,gid} credentials fix conflicted with the
    cleanup in net-next to now pass cred structs around.

    The be2net driver had a bug fix in 'net' that overlapped with the VLAN
    interface changes by Patrick McHardy in net-next.

    An IGB conflict existed because in 'net' the build_skb() support was
    reverted, and in 'net-next' there was a comment style fix within that
    code.

    Several batman-adv conflicts were resolved by making sure that all
    calls to batadv_is_my_mac() are changed to have a new bat_priv first
    argument.

    Eric Dumazet's TS ECR fix in TCP in 'net' conflicted with the F-RTO
    rewrite in 'net-next', mostly overlapping changes.

    Thanks to Stephen Rothwell and Antonio Quartulli for help with several
    of these merge resolutions.

    Signed-off-by: David S. Miller

    David S. Miller
     

20 Apr, 2013

2 commits


13 Apr, 2013

1 commit


08 Apr, 2013

1 commit


03 Apr, 2013

2 commits

  • Pull net into net-next to get the synchronize_net() bug fix in
    bonding.

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Currently CBQ works incorrectly for limits > 10% of real link bandwidth,
    and practically does not work for limits > 50% of real link bandwidth.
    Below are the results of experiments taken on a 1 Gbit link:

    In shaper | Actual result
    ----------+---------------
         100M |      108 Mbps
         200M |      244 Mbps
         300M |      412 Mbps
         500M |      893 Mbps

    This happens because q->now changes incorrectly in cbq_dequeue():
    when it is called before the real end of packet transmission,
    L2T is greater than the real time delay, so q->now gets an extra boost
    but never compensates for it.

    To fix this problem we prevent q->now from changing until it is
    synchronized with real time.

    Signed-off-by: Vasily Averin
    Reviewed-by: Alexey Kuznetsov
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Vasily Averin
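    The essence of the fix can be sketched as a clamp (hypothetical helper;
    the real change is in cbq_dequeue() in net/sched/sch_cbq.c): the
    scheduler clock may only advance up to real time, so the extra boost
    from an early dequeue cannot accumulate.

```c
/* Hypothetical sketch: advance the scheduler clock q_now by the computed
 * transmission time incr, but never past real time, so an early dequeue
 * cannot give q_now an uncompensated boost. */
static long advance_clock(long q_now, long incr, long real_now)
{
	long next = q_now + incr;

	return next > real_now ? real_now : next;
}
```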
     

02 Apr, 2013

1 commit


30 Mar, 2013

1 commit


29 Mar, 2013

1 commit


28 Mar, 2013

1 commit

  • It seems that commit

    commit 292f1c7ff6cc10516076ceeea45ed11833bb71c7
    Author: Jiri Pirko
    Date: Tue Feb 12 00:12:03 2013 +0000

    sch: make htb_rate_cfg and functions around that generic

    adds a small regression.

    Before:

    # tc qdisc add dev eth0 root handle 1: htb default ffff
    # tc class add dev eth0 classid 1:ffff htb rate 5Gbit
    # tc -s class show dev eth0
    class htb 1:ffff root prio 0 rate 5000Mbit ceil 5000Mbit burst 625b cburst
    625b
    Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
    rate 0bit 0pps backlog 0b 0p requeues 0
    lended: 0 borrowed: 0 giants: 0
    tokens: 31 ctokens: 31

    After:

    # tc qdisc add dev eth0 root handle 1: htb default ffff
    # tc class add dev eth0 classid 1:ffff htb rate 5Gbit
    # tc -s class show dev eth0
    class htb 1:ffff root prio 0 rate 1544Mbit ceil 1544Mbit burst 625b cburst
    625b
    Sent 5073 bytes 41 pkt (dropped 0, overlimits 0 requeues 0)
    rate 1976bit 2pps backlog 0b 0p requeues 0
    lended: 41 borrowed: 0 giants: 0
    tokens: 1802 ctokens: 1802

    This is probably due to a lost u64 cast of the rate parameter in
    psched_ratecfg_precompute() (net/sched/sch_generic.c).

    Signed-off-by: Sergey Popovich
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Sergey Popovich
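    The class of bug here is plain 32-bit truncation. This sketch shows how
    converting a byte rate to a bit rate without a u64 cast silently wraps
    (the exact figures in the dump above also involve the precomputed
    mult/shift pair, which this sketch does not model):

```c
#include <stdint.h>

/* 5 Gbit/s is 625,000,000 bytes/s.  Shifting left by 3 (x8, bytes to
 * bits) on a 32-bit type wraps past 2^32; casting to u64 first, as the
 * fix does for the rate parameter, preserves the value. */
static uint32_t to_bps_u32(uint32_t bytes_per_sec)
{
	return bytes_per_sec << 3;		/* truncates above 2^32 */
}

static uint64_t to_bps_u64(uint32_t bytes_per_sec)
{
	return (uint64_t)bytes_per_sec << 3;	/* correct */
}
```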
     

27 Mar, 2013

1 commit

  • When the legacy array rtm_min still existed, the length check within
    these functions was covered by rtm_min[RTM_NEWTFILTER],
    rtm_min[RTM_NEWQDISC] and rtm_min[RTM_NEWTCLASS].

    But after Thomas Graf removed rtm_min several days ago, these checks
    are missing. The other doit functions should be OK.

    Signed-off-by: Hong Zhiguo
    Acked-by: Thomas Graf
    Signed-off-by: David S. Miller

    Hong zhi guo
     

22 Mar, 2013

1 commit


12 Mar, 2013

1 commit


07 Mar, 2013

1 commit

  • HTB uses an internal pfifo queue whose limit is not reported
    to userland tools (tc); its value is inherited from the device
    tx_queue_len at setup time.

    Introduce TCA_HTB_DIRECT_QLEN attribute to allow finer control.

    Remove two obsolete pr_err() calls as well.

    Signed-off-by: Eric Dumazet
    Cc: Jamal Hadi Salim
    Signed-off-by: David S. Miller

    Eric Dumazet
     

06 Mar, 2013

5 commits

  • QFQ+ can select for service only 'eligible' aggregates, i.e.,
    aggregates that would have started to be served also in the emulated
    ideal system. As a consequence, for QFQ+ to be work conserving, at
    least one of the active aggregates must be eligible when it is time to
    choose the next aggregate to serve.

    The set of eligible aggregates is updated through the function
    qfq_update_eligible(), which does guarantee that, after its
    invocation, at least one of the active aggregates is eligible.
    Because of this property, this function is invoked in
    qfq_deactivate_agg() to guarantee that at least one of the active
    aggregates is still eligible after an aggregate has been deactivated.
    In particular, the critical case is when there are other active
    aggregates, but the aggregate being deactivated happens to be the only
    one eligible.

    However, this precaution is not needed for QFQ+ to be work conserving,
    because update_eligible() is always invoked also at the beginning of
    qfq_choose_next_agg(). This patch removes the additional invocation of
    update_eligible() in qfq_deactivate_agg().

    Signed-off-by: Paolo Valente
    Reviewed-by: Fabio Checconi
    Signed-off-by: David S. Miller

    Paolo Valente
     
  • By definition of (the algorithm of) QFQ+, the system virtual time must
    be pushed up only if there is no 'eligible' aggregate, i.e. no
    aggregate that would have started to be served also in the ideal
    system emulated by QFQ+. QFQ+ serves only eligible aggregates, hence
    the aggregate currently in service is eligible. As a consequence, to
    decide whether there is no eligible aggregate, QFQ+ must also check
    whether there is no aggregate in service.

    Signed-off-by: Paolo Valente
    Reviewed-by: Fabio Checconi
    Signed-off-by: David S. Miller

    Paolo Valente
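    The condition the patch describes reduces to a two-part check (sketched
    here with hypothetical names; the real code inspects the ER group bitmap
    and the in-service aggregate pointer in net/sched/sch_qfq.c):

```c
/* Push up the system virtual time V only when no aggregate is eligible
 * AND no aggregate is in service: the in-service aggregate is by
 * definition eligible, so omitting the second check pushes V up too
 * eagerly. */
static int must_push_up_v(unsigned long eligible_bitmap, int agg_in_service)
{
	return eligible_bitmap == 0 && !agg_in_service;
}
```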
     
  • Aggregate budgets are computed so as to guarantee that, after an
    aggregate has been selected for service, that aggregate has enough
    budget to serve at least one maximum-size packet for the classes it
    contains. For this reason, after a new aggregate has been selected
    for service, its next packet is immediately dequeued, without any
    further control.

    The maximum packet size for a class, lmax, can be changed through
    qfq_change_class(). In case the user sets lmax to a lower value than
    the size of some of the still-to-arrive packets, QFQ+ will
    automatically push up lmax as it enqueues these packets. This
    automatic push-up is likely to happen with TSO/GSO.

    In any case, if lmax is assigned a lower value than the size of some
    of the packets already enqueued for the class, then the following
    problem may occur: the size of the next packet to dequeue for the
    class may happen to be larger than lmax, after the aggregate to which
    the class belongs has been just selected for service. In this case,
    even the budget of the aggregate, which is an unsigned value, may be
    lower than the size of the next packet to dequeue. After dequeueing
    this packet and subtracting its size from the budget, the latter would
    wrap around.

    This fix prevents the budget from wrapping around after any packet
    dequeue.

    Signed-off-by: Paolo Valente
    Reviewed-by: Fabio Checconi
    Signed-off-by: David S. Miller

    Paolo Valente
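    The wrap-around hazard and its guard can be shown in two lines
    (hypothetical helper; the real fix checks the packet length against the
    aggregate budget in net/sched/sch_qfq.c before subtracting):

```c
#include <stdint.h>

/* Unsigned subtraction wraps: 1000 - 1500 on a u32 yields 4294966796.
 * Guarding the subtraction keeps the budget sane even when lmax was
 * lowered below the size of an already-enqueued packet. */
static uint32_t charge_budget(uint32_t budget, uint32_t pkt_len)
{
	return pkt_len >= budget ? 0 : budget - pkt_len;
}
```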
     
  • If no aggregate is in service, then the function qfq_dequeue() does
    not dequeue any packet. For this reason, to guarantee QFQ+ to be work
    conserving, a just-activated aggregate must be set as in service
    immediately if it happens to be the only active aggregate.
    This is done by the function qfq_enqueue().

    Unfortunately, the function qfq_add_to_agg(), used to add a class to
    an aggregate, does not perform this important additional operation.
    In particular, if: 1) qfq_add_to_agg() is invoked to complete the move
    of a class from a source aggregate, which becomes inactive as a result,
    to a destination aggregate, which becomes active instead, and 2) the
    destination aggregate becomes the only active aggregate, then that
    aggregate is nevertheless not set as in service. QFQ+ then remains in a
    non-work-conserving state until a new invocation of qfq_enqueue()
    recovers the situation.

    This fix solves the problem by moving the logic for setting an
    aggregate as in service directly into the function qfq_activate_agg().
    Hence, from whatever point qfq_activate_agg() is invoked, QFQ+
    remains work conserving. Since the more complex logic of this new
    version of the function is not necessary in qfq_dequeue() to
    reschedule an aggregate that finishes its budget, that aggregate is
    now rescheduled by invoking the needed functions directly.

    Signed-off-by: Paolo Valente
    Reviewed-by: Fabio Checconi
    Signed-off-by: David S. Miller

    Paolo Valente
     
  • Between two invocations of make_eligible, the system virtual time may
    happen to grow enough that, in its binary representation, a bit with
    higher order than 31 flips. This happens especially with TSO/GSO.
    Before this fix, the mask used in make_eligible was computed by
    left-shifting 1UL, which overflows on 32-bit machines when the number
    of places to shift is higher than 31.
    The fix just replaces 1UL with 1ULL.

    Signed-off-by: Paolo Valente
    Reviewed-by: Fabio Checconi
    Signed-off-by: David S. Miller

    Paolo Valente