15 Jul, 2007

6 commits

  • The NET_CLS_ACT option is now a full replacement for NET_CLS_POLICE,
    remove the old code. The config option will be kept around to select
    the equivalent NET_CLS_ACT options for a short time to allow easier
    upgrades.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     
  • The behaviour of NET_CLS_POLICE for TC_POLICE_RECLASSIFY was to return
    it to the qdisc, which could handle it internally or ignore it. With
    NET_CLS_ACT however, tc_classify starts over at the first classifier
    and never returns it to the qdisc. This makes it impossible to support
    qdisc-internal reclassification, which in turn makes it impossible to
    remove the old NET_CLS_POLICE code without breaking compatibility since
    we have two qdiscs (CBQ and ATM) that support this.

    This patch adds a tc_classify_compat function that handles
    reclassification the old way and changes CBQ and ATM to use it.

    This again is of course not fully backwards compatible with the previous
    NET_CLS_ACT behaviour. Unfortunately there is no way to fully maintain
    compatibility *and* support qdisc internal reclassification with
    NET_CLS_ACT, but this seems like the better choice over keeping the two
    incompatible options around forever.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
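
    The difference between the two reclassification styles described above can be sketched with a toy model. This is not the kernel code: the verdict values, the classifier signature, the loop budget and every function name below are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical verdicts; the real kernel constants differ. */
enum verdict { V_UNSPEC = -1, V_OK = 0, V_RECLASSIFY = 1 };

/* Toy classifier: inspects and may rewrite a packet mark. */
typedef enum verdict (*classifier_fn)(int *mark);

/* "Restart" style: on RECLASSIFY, start over at the first classifier.
 * The caller (the qdisc) never sees RECLASSIFY. The budget is an
 * arbitrary bound added for this sketch to avoid endless loops. */
static enum verdict classify_restart(classifier_fn *chain, size_t n, int *mark)
{
    int budget = 4;
    size_t i = 0;

    assert(chain != NULL && mark != NULL);
    while (i < n) {
        enum verdict v = chain[i](mark);
        if (v == V_RECLASSIFY) {
            if (--budget == 0)
                return V_UNSPEC;
            i = 0;              /* begin again at the first classifier */
            continue;
        }
        if (v != V_UNSPEC)
            return v;
        i++;
    }
    return V_UNSPEC;
}

/* "Compat" style: hand RECLASSIFY back to the caller, which may
 * handle it internally or ignore it. */
static enum verdict classify_compat(classifier_fn *chain, size_t n, int *mark)
{
    assert(chain != NULL && mark != NULL);
    for (size_t i = 0; i < n; i++) {
        enum verdict v = chain[i](mark);
        if (v != V_UNSPEC)
            return v;           /* includes V_RECLASSIFY */
    }
    return V_UNSPEC;
}

/* Example classifier: asks for reclassification once, then accepts. */
static enum verdict remark_once(int *mark)
{
    if (*mark == 0) {
        *mark = 1;
        return V_RECLASSIFY;
    }
    return V_OK;
}
```

    With a chain containing only remark_once, classify_restart() loops internally and ends with V_OK, while classify_compat() returns V_RECLASSIFY so that the caller can decide what to do with it.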
     
  • Handle act_api classification results.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     
  • Handle act_api classification results.

    The ATM scheduler behaves slightly differently from other schedulers
    in that it only handles policer results for successful classifications;
    this behaviour is retained for the act_api case.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     
  • Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     
  • As noticed by Ranko Zivojnovic, calling qdisc_run
    from the timer handler can result in deadlock:

    > CPU#0
    >
    > qdisc_watchdog() fires and gets dev->queue_lock
    > qdisc_run()...qdisc_restart()...
    > -> releases dev->queue_lock and enters dev_hard_start_xmit()
    >
    > CPU#1
    >
    > tc del qdisc dev ...
    > qdisc_graft()...dev_graft_qdisc()...dev_deactivate()...
    > -> grabs dev->queue_lock ...
    >
    > qdisc_reset()...{cbq,hfsc,htb,netem,tbf}_reset()...qdisc_watchdog_cancel()...
    > -> hrtimer_cancel() - waiting for the qdisc_watchdog() to exit, while still
    > holding dev->queue_lock
    >
    > CPU#0
    >
    > dev_hard_start_xmit() returns ...
    > -> wants to get dev->queue_lock(!)
    >
    > DEADLOCK!

    The entire optimization is a bit questionable IMO, it moves potentially
    large parts of NET_TX_SOFTIRQ work to TIMER_SOFTIRQ/HRTIMER_SOFTIRQ,
    which kind of defeats the separation of them.

    Signed-off-by: Patrick McHardy
    Acked-by: Ranko Zivojnovic
    Signed-off-by: David S. Miller

    Patrick McHardy
     

12 Jul, 2007

1 commit


11 Jul, 2007

10 commits

  • Currently the HTB scheduler does not correctly account for TSO packets,
    which causes large inaccuracies in the bandwidth control when using TSO.
    This patch allows the HTB scheduler to work with TSO enabled devices.

    Signed-off-by: Ranjit Manomohan
    Signed-off-by: David S. Miller

    Ranjit Manomohan
     
  • Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     
  • Use the generic estimator instead of reimplementing (parts of) it.
    For compatibility always create a default estimator for new classes.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     
  • Remove stats_lock pointers from qdisc-internal structures, in all cases
    it points to dev->queue_lock. The only case where it is necessary is for
    top-level qdiscs, where it might also point to dev->ingress_lock in case
    of the ingress qdisc. Also remove it from actions completely, it always
    points to the actions internal lock.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     
  • The generic estimator is always built in anyway, and all the config option
    does is prevent including a minimal amount of code for setting it up.
    Additionally the option is already automatically selected for most cases.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     
  • Add the new sch_rr qdisc for multiqueue network device support. Allow
    sch_prio and sch_rr to be compiled with or without multiqueue hardware
    support.

    sch_rr is part of sch_prio, and is referenced from MODULE_ALIAS. This
    was done since sch_prio and sch_rr only differ in their dequeue
    routine.

    Signed-off-by: Peter P Waskiewicz Jr
    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Peter P Waskiewicz Jr
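
    As a rough illustration of two qdiscs that share everything except the dequeue routine, here is a toy model with per-band packet counters. The structure, the field names and the exact round-robin policy are assumptions made for this sketch; the commit message only states that the dequeue routines differ.

```c
#include <assert.h>

#define BANDS 3

/* Hypothetical toy qdisc: each band only counts queued packets. */
struct toy_bands {
    int qlen[BANDS];
    int curband;        /* next band to try; used only by round-robin */
};

/* Strict priority: always serve the lowest-numbered non-empty band.
 * Returns the band served, or -1 if every band is empty. */
static int prio_dequeue(struct toy_bands *q)
{
    for (int b = 0; b < BANDS; b++) {
        if (q->qlen[b] > 0) {
            q->qlen[b]--;
            return b;
        }
    }
    return -1;
}

/* Round-robin: resume scanning at the band after the last one served.
 * Returns the band served, or -1 if every band is empty. */
static int rr_dequeue(struct toy_bands *q)
{
    assert(q->curband >= 0 && q->curband < BANDS);
    for (int i = 0; i < BANDS; i++) {
        int b = (q->curband + i) % BANDS;
        if (q->qlen[b] > 0) {
            q->qlen[b]--;
            q->curband = (b + 1) % BANDS;
            return b;
        }
    }
    return -1;
}
```

    With two packets in band 0 and two in band 1, prio_dequeue() drains band 0 before touching band 1, whereas rr_dequeue() alternates between the two bands.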
     
  • Add the multiqueue hardware device support API to the core network
    stack. Allow drivers to allocate multiple queues and manage them at
    the netdev level if they choose to do so.

    Added a new field to sk_buff, namely queue_mapping, for drivers to
    know which tx_ring to select based on OS classification of the flow.

    Signed-off-by: Peter P Waskiewicz Jr
    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Peter P Waskiewicz Jr
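
    A driver's use of queue_mapping might look roughly like the sketch below. Only the field name queue_mapping comes from the commit message; the toy structures, the ring count and the fallback to ring 0 for out-of-range values are assumptions for illustration, not the real driver API.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_TX_RINGS 4

/* Hypothetical stand-ins for sk_buff and driver-private state. */
struct toy_skb     { unsigned short queue_mapping; };
struct toy_ring    { unsigned int pending; };
struct toy_adapter { struct toy_ring tx_ring[NUM_TX_RINGS]; };

/* Pick the ring chosen by the stack's classification of the flow.
 * Falling back to ring 0 on an out-of-range value is a choice made
 * for this sketch. */
static struct toy_ring *select_tx_ring(struct toy_adapter *a,
                                       const struct toy_skb *skb)
{
    size_t idx = skb->queue_mapping;

    if (idx >= NUM_TX_RINGS)
        idx = 0;
    return &a->tx_ring[idx];
}

/* Toy transmit routine: account the packet on the selected ring. */
static void toy_xmit(struct toy_adapter *a, const struct toy_skb *skb)
{
    assert(a != NULL && skb != NULL);
    select_tx_ring(a, skb)->pending++;
}
```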
     
  • Changes :

    - netif_queue_stopped need not be called inside qdisc_restart as
    it has been called already in qdisc_run() before the first skb
    is sent, and in __qdisc_run() after each intermediate skb is
    sent (note : we are the only sender, so the queue cannot get
    stopped while the tx lock is held in the ~LLTX case).

    - BUG_ON((int) q->q.qlen < 0) was a relic from old times when -1
    meant more packets are available, and __qdisc_run used to loop
    when qdisc_restart() returned -1. During those days, it was
    necessary to make sure that qlen is never less than zero, since
    __qdisc_run would get into an infinite loop if no packets are on
    the queue and this bug in qdisc was there (and worse - no more
    skbs could ever get queue'd as we hold the queue lock too). With
    Herbert's recent change to return values, this check is not
    required. Hopefully Herbert can validate this change. If at all
    this is required, it should be added to skb_dequeue (in failure
    case), and not to qdisc_qlen.

    Signed-off-by: Krishna Kumar
    Signed-off-by: David S. Miller

    Krishna Kumar
     
  • New changes :

    - Incorporated Peter Waskiewicz's comments.
    - Re-added back one warning message (on driver returning wrong value).

    Previous changes :

    - Converted to use switch/case code which looks neater.

    - "if (ret == NETDEV_TX_LOCKED && lockless)" is buggy, and the lockless
    check should be removed, since driver will return NETDEV_TX_LOCKED only
    if lockless is true and driver has to do the locking. In the original
    code as well as the latest code, this code can result in a bug where
    if LLTX is not set for a driver (lockless == 0) but the driver is written
    wrongly to do a trylock (despite LLTX not being set), the driver returns
    LOCKED. But since lockless is zero, the packet is requeue'd instead of
    calling collision code which will issue warning and free up the skb.
    Instead this skb will be retried with this driver next time, and the same
    result will ensue. Removing this check will catch these driver bugs instead
    of hiding the problem. I am keeping this change to readability section
    since :
    a. it is confusing to check two things as it is; and
    b. it is difficult to keep this check in the changed 'switch' code.

    - Changed some names, like try_get_tx_pkt to dev_dequeue_skb (as that is
    the work being done and easier to understand) and do_dev_requeue to
    dev_requeue_skb, merged handle_dev_cpu_collision and tx_islocked to
    dev_handle_collision (handle_dev_cpu_collision is a small routine with only
    one caller, so there is no need to have two separate routines, which also
    results in getting rid of two macros, etc.).

    - Removed an XXX comment as it should never fail (I suspect this was related
    to batch skb WIP, Jamal ?). Converted some functions to original coding
    style of having the return values and the function name on same line, eg
    prio2list.

    Signed-off-by: Krishna Kumar
    Signed-off-by: David S. Miller

    Krishna Kumar
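
    The switch/case handling of the driver's return value, without the extra lockless check, can be sketched as follows. The TOY_ constants, the action names and the placement of the warning in the default branch are assumptions for this sketch and do not reproduce the kernel's definitions.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical return codes and follow-up actions. */
enum toy_tx     { TOY_TX_OK, TOY_TX_BUSY, TOY_TX_LOCKED };
enum toy_action { ACT_CONTINUE, ACT_COLLISION, ACT_REQUEUE };

/* Map a driver return value to the follow-up action.
 * *warned is set when the driver returned an unexpected value. */
static enum toy_action handle_xmit_result(int ret, int *warned)
{
    assert(warned != NULL);
    switch (ret) {
    case TOY_TX_OK:
        return ACT_CONTINUE;
    case TOY_TX_LOCKED:
        /* No lockless check here: a driver that returns LOCKED
         * without being lockless reaches the collision path, where
         * the problem is reported instead of hidden. */
        return ACT_COLLISION;
    default:
        if (ret != TOY_TX_BUSY) {
            *warned = 1;
            fprintf(stderr, "driver returned unexpected code %d\n", ret);
        }
        return ACT_REQUEUE;
    }
}
```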
     
  • Over the years this code has gotten hairier, resulting in many long
    discussions over long summer days and patches that get it wrong.
    This patch helps tame that code so normal people will understand it.

    Thanks to Thomas Graf, Peter P. Waskiewicz Jr, and Patrick McHardy
    for their valuable reviews.

    Signed-off-by: Jamal Hadi Salim
    Signed-off-by: David S. Miller

    Jamal Hadi Salim
     

08 Jun, 2007

1 commit


04 Jun, 2007

2 commits


25 May, 2007

2 commits


14 May, 2007

1 commit


11 May, 2007

5 commits

  • When we relinquish queue_lock in qdisc_restart and then retake it for
    requeueing, we might race against dev_deactivate and end up requeueing
    onto noop_qdisc. This causes a warning to be printed.

    This patch fixes this by checking this before we requeue. As an added
    bonus, we can remove the same check in __qdisc_run which was added to
    prevent dev->gso_skb from being requeued when we're shutting down.

    Even though we've had to add a new conditional in its place, it's better
    because it only happens on requeues rather than every single time that
    qdisc_run is called.

    For this to work we also need to move the clearing of gso_skb up in
    dev_deactivate as now qdisc_restart can occur even after we wait for
    __LINK_STATE_QDISC_RUNNING to clear (but it won't do anything as long
    as the queue and gso_skb are already clear).

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     
  • Now that we return the queue length after NETDEV_TX_OK we better
    make sure that we have the right queue. Otherwise we can cause a
    stall after a really quick dev_deactivate/dev_activate.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     
  • The current return value scheme and associated comment was invented
    back in the 20th century when we still had that tbusy flag. Things
    have changed quite a bit since then (even Tony Blair is moving on
    now, not to mention the new French president).

    All we need to indicate now is whether the caller should continue
    processing the queue. Therefore it's sufficient if we return 0 if
    we want to stop and non-zero otherwise.

    This is based on a patch by Krishna Kumar.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
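
    The "return 0 to stop, non-zero to continue" convention leads to a very small caller loop. The sketch below is a toy model under that convention; the structure and function names are hypothetical, and returning the remaining backlog as the non-zero value is one possible choice.

```c
#include <assert.h>

/* Hypothetical toy queue: a backlog counter and a "stopped" flag. */
struct toy_queue { int qlen; int stopped; };

/* Send at most one packet. Returns 0 when the caller should stop
 * processing the queue and non-zero when it should continue. */
static int toy_restart(struct toy_queue *q)
{
    assert(q->qlen >= 0);
    if (q->qlen == 0 || q->stopped)
        return 0;
    q->qlen--;          /* "transmit" one packet */
    return q->qlen;     /* remaining backlog; 0 ends the loop */
}

/* Drain the queue; returns the number of packets transmitted. */
static int toy_run(struct toy_queue *q)
{
    int before = q->qlen;

    while (toy_restart(q))
        ;               /* keep going while told to continue */
    return before - q->qlen;
}
```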
     
  • When transmit fails with NETDEV_TX_LOCKED the skb is requeued
    to dev->qdisc again. The dev->qdisc pointer is protected by
    the queue lock which needs to be dropped when attempting to
    transmit and acquired again before requeueing. The problem is
    that qdisc_restart() fetches the dev->qdisc pointer once and
    stores it in the `q' variable which is invalidated when
    dropping the queue_lock, therefore the variable needs to be
    refreshed before requeueing.

    Signed-off-by: Thomas Graf
    Signed-off-by: David S. Miller

    Thomas Graf
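
    The stale-pointer problem can be modelled without real locking. In the sketch below the comments mark where the queue lock would be dropped and retaken; the structures, the callback type and the helper that swaps the qdisc are all hypothetical and exist only to show why the pointer must be re-read.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a qdisc and a device. */
struct toy_qdisc { int requeued; };
struct toy_dev   { struct toy_qdisc *qdisc; };

/* Simulates the window in which the lock is dropped. It may replace
 * dev->qdisc and returns non-zero if the packet must be requeued. */
typedef int (*toy_xmit_fn)(struct toy_dev *dev);

static void toy_restart(struct toy_dev *dev, toy_xmit_fn xmit)
{
    struct toy_qdisc *q = dev->qdisc;   /* valid only while locked */
    int must_requeue;

    assert(q != NULL);

    /* queue lock would be dropped here */
    must_requeue = xmit(dev);
    /* queue lock would be retaken here */

    if (must_requeue) {
        q = dev->qdisc;     /* refresh: the old value may be stale */
        q->requeued++;
    }
}

/* Example transmit hook: another context swaps the qdisc meanwhile. */
static struct toy_qdisc toy_replacement;

static int toy_xmit_swaps_qdisc(struct toy_dev *dev)
{
    dev->qdisc = &toy_replacement;
    return 1;
}
```

    Without the refresh, the requeue in this model would land on the old qdisc rather than the one currently attached to the device.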
     
  • Optimize teql_enqueue so that it first checks limits before enqueueing.

    Signed-off-by: Krishna Kumar
    Signed-off-by: David S. Miller

    Krishna Kumar
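
    Checking the limit before touching the queue might be sketched as below. This is a generic toy FIFO, not the teql code; the structure, the drop counter and the return values are assumptions for illustration.

```c
#include <assert.h>

/* Hypothetical toy FIFO with a length limit and a drop counter. */
struct toy_fifo { int qlen; int limit; int drops; };

/* Check the limit first so that a full queue is never modified.
 * Returns 0 on success and -1 when the packet is dropped. */
static int toy_enqueue(struct toy_fifo *q)
{
    assert(q->limit >= 0);
    if (q->qlen >= q->limit) {
        q->drops++;
        return -1;
    }
    q->qlen++;
    return 0;
}
```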
     

04 May, 2007

1 commit

  • Cleanup of dev_base list use, with the aim to simplify making device
    list per-namespace. On almost every occasion, use of the dev_base variable
    and dev->next pointer could be easily replaced by a for_each_netdev
    loop. The few most complicated places were converted to using
    first_netdev()/next_netdev().

    Signed-off-by: Pavel Emelianov
    Acked-by: Kirill Korotaev
    Signed-off-by: David S. Miller

    Pavel Emelianov
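
    The benefit of an iteration macro is that callers stop touching the list head and the next pointer directly, so the list representation can change later. The sketch below uses a hypothetical toy macro over a singly linked list; the kernel's for_each_netdev has its own definition, which is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy device with an explicit next pointer. */
struct toy_netdev { int ifindex; struct toy_netdev *next; };

/* Toy iteration macro: callers no longer open-code the list walk. */
#define toy_for_each_netdev(head, d) \
    for ((d) = (head); (d) != NULL; (d) = (d)->next)

/* Count devices using only the macro. */
static int count_devices(struct toy_netdev *head)
{
    struct toy_netdev *d;
    int n = 0;

    toy_for_each_netdev(head, d)
        n++;
    assert(n >= 0);
    return n;
}
```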
     

26 Apr, 2007

11 commits