15 Oct, 2007

40 commits

  • rename all 'cnt' fields and variables to the less yucky 'count' name.

    yuckage noticed by Andrew Morton.

    no change in code, other than the /proc/sched_debug bkl_count string got
    a bit larger:

    text data bss dec hex filename
    38236 3506 24 41766 a326 sched.o.before
    38240 3506 24 41770 a32a sched.o.after

    Signed-off-by: Ingo Molnar
    Reviewed-by: Thomas Gleixner

    Ingo Molnar
     
  • fix yield bugs due to the current-not-in-rbtree changes: the task is
    not in the rbtree so rbtree-removal is a no-no.

    [ From: Srivatsa Vaddagiri : build fix. ]

    also, nice code size reduction:

    kernel/sched.o:
    text data bss dec hex filename
    38323 3506 24 41853 a37d sched.o.before
    38236 3506 24 41766 a326 sched.o.after

    Signed-off-by: Ingo Molnar
    Signed-off-by: Dmitry Adamushko
    Reviewed-by: Thomas Gleixner

    Dmitry Adamushko
     
  • group scheduler wakeup latency fix: when checking for preemption
    we must check cross-group too, not just intra-group.

    Signed-off-by: Ingo Molnar

    Srivatsa Vaddagiri
     
  • Lee Schermerhorn noticed that set_leftmost() contains dead code;
    remove it.

    Reported-by: Lee Schermerhorn
    Signed-off-by: Ingo Molnar
    Reviewed-by: Thomas Gleixner

    Ingo Molnar
     
  • Adjusting sched_class is a missing part of the existing "do not leak
    PI boosting priority to the child" logic in sched_fork(). This patch
    moves the sched_class adjustment from wake_up_new_task() to
    sched_fork().

    this also shrinks the code a bit:

    text data bss dec hex filename
    40111 4018 292 44421 ad85 sched.o.before
    40102 4018 292 44412 ad7c sched.o.after

    Signed-off-by: Hiroshi Shimamoto
    Signed-off-by: Dmitry Adamushko
    Signed-off-by: Ingo Molnar
    Reviewed-by: Thomas Gleixner

    Hiroshi Shimamoto
     
  • max_vruntime() simplification.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra

    Peter Zijlstra
     
  • fix sched_fork(): large latencies at new task creation time because
    the ->vruntime was not fixed up cross-CPU, if the parent got migrated
    after the child's CPU got set up.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra
    Reviewed-by: Thomas Gleixner

    Ingo Molnar
     
  • fix sign check error in place_entity() - we'd get excessive
    latencies due to negatives being converted to large u64's.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra

    Ingo Molnar
     
  • undo some of the recent changes that are not needed after all,
    such as last_min_vruntime.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra

    Ingo Molnar
     
  • remove last_min_vruntime use - prepare to remove it.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra

    Ingo Molnar
     
  • remove condition from set_task_cpu(). Now that ->vruntime
    is not global anymore, it should (and does) work fine without
    it too.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra

    Ingo Molnar
     
  • entity_key() fix - we'd occasionally end up with a 0 vruntime
    in the !initial case.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra

    Ingo Molnar
     
  • debug feature: check how well we schedule within a reasonable
    vruntime 'spread' range. (note that CPU overload can increase
    the spread, so this is not a hard condition, but normal loads
    should be within the spread.)

    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra

    Peter Zijlstra
     
  • more width for parameter printouts in /proc/sched_debug.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra
    Reviewed-by: Thomas Gleixner

    Ingo Molnar
     
  • add vslice: the load-dependent "virtual slice" a task should
    run ideally, so that the observed latency stays within the
    sched_latency window.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra
    Reviewed-by: Thomas Gleixner

    Peter Zijlstra
     
  • print the current value of all tunables in /proc/sched_debug output.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra
    Reviewed-by: Thomas Gleixner

    Ingo Molnar
     
  • remove unneeded tunables.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra
    Reviewed-by: Thomas Gleixner

    Ingo Molnar
     
  • build fix for the SCHED_DEBUG && !SCHEDSTATS case.

    Signed-off-by: S.Caglar Onur
    Signed-off-by: Ingo Molnar
    Reviewed-by: Thomas Gleixner

    S.Caglar Onur
     
  • add per task and per rq BKL usage statistics.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra
    Reviewed-by: Thomas Gleixner

    Ingo Molnar
     
  • enable CONFIG_FAIR_GROUP_SCHED=y by default.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra
    Reviewed-by: Thomas Gleixner

    Ingo Molnar
     
  • fair-group sched, cleanups.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra
    Reviewed-by: Thomas Gleixner

    Ingo Molnar
     
  • Enable user-id based fair group scheduling. This is useful for anyone
    who wants to test the group scheduler w/o having to enable
    CONFIG_CGROUPS.

    A separate scheduling group (i.e. struct task_grp) is automatically created
    for every new user added to the system. When a task's uid changes, the task
    is moved to the corresponding scheduling group.

    A /proc tunable (/proc/root_user_share) is also provided to tune the root
    user's share of CPU bandwidth.

    Signed-off-by: Srivatsa Vaddagiri
    Signed-off-by: Dhaval Giani
    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra
    Reviewed-by: Thomas Gleixner

    Srivatsa Vaddagiri
     
  • With a view to supporting user-id based fair scheduling (and not just
    container-based fair scheduling), this patch renames several functions
    and makes them independent of whether they are being used for container
    or user-id based fair scheduling.

    Also fix a problem reported by KAMEZAWA Hiroyuki (allocating
    undersized arrays for tg->cfs_rq[] and tg->se[]).

    Signed-off-by: Srivatsa Vaddagiri
    Signed-off-by: Dhaval Giani
    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra
    Reviewed-by: Thomas Gleixner

    Srivatsa Vaddagiri
     
  • Print &rq->cfs statistics as well (useful for group scheduling)

    Signed-off-by: Srivatsa Vaddagiri
    Signed-off-by: Dhaval Giani
    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra
    Reviewed-by: Thomas Gleixner

    Srivatsa Vaddagiri
     
  • print nr_running and load information for cfs_rq in /proc/sched_debug

    Signed-off-by: Srivatsa Vaddagiri
    Signed-off-by: Dhaval Giani
    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra
    Reviewed-by: Thomas Gleixner

    Srivatsa Vaddagiri
     
  • fix a minor bug in yield (seen for CONFIG_FAIR_GROUP_SCHED):
    group scheduling would skew when yield was called.

    Signed-off-by: Srivatsa Vaddagiri
    Signed-off-by: Dhaval Giani
    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra
    Reviewed-by: Thomas Gleixner

    Srivatsa Vaddagiri
     
  • Revert removal of set_curr_task.
    Use put_prev_task/set_curr_task when changing groups/policies

    Signed-off-by: Srivatsa Vaddagiri
    Signed-off-by: Dhaval Giani
    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra

    Srivatsa Vaddagiri
     
  • some trivial whitespace cleanups.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra
    Reviewed-by: Thomas Gleixner

    Ingo Molnar
     
  • fix formatting of /proc/sched_debug

    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra
    Reviewed-by: Thomas Gleixner

    Mike Galbraith
     
  • enhance debug output by printing 12345678 nsecs as 12.345678 msecs,
    which is more human-readable.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra
    Reviewed-by: Thomas Gleixner

    Ingo Molnar
     
  • print the correct number of dashes in /proc/sched_debug.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra
    Reviewed-by: Thomas Gleixner

    Ingo Molnar
     
  • rework enqueue/dequeue_entity() to get rid of
    sched_class::set_curr_task(). This simplifies sched_setscheduler(),
    rt_mutex_setprio() and sched_move_task().

    text data bss dec hex filename
    24330 2734 20 27084 69cc sched.o.before
    24233 2730 20 26983 6967 sched.o.after

    Signed-off-by: Dmitry Adamushko
    Signed-off-by: Srivatsa Vaddagiri
    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra
    Reviewed-by: Thomas Gleixner

    Dmitry Adamushko
     
  • the 'p' (task_struct) parameter of sched_class::yield_task() is
    redundant as the caller is always 'current'. Get rid of it.

    text data bss dec hex filename
    24341 2734 20 27095 69d7 sched.o.before
    24330 2734 20 27084 69cc sched.o.after

    Signed-off-by: Dmitry Adamushko
    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra
    Reviewed-by: Thomas Gleixner

    Dmitry Adamushko
     
  • since we no longer keep 'current' within the tree,
    dequeue/enqueue_entity() is useless for 'current' in
    task_new_fair(). We are about to reschedule and
    sched_class->put_prev_task() will put 'current' back into the tree,
    based on its new key.

    text data bss dec hex filename
    24388 2734 20 27142 6a06 sched.o.before
    24341 2734 20 27095 69d7 sched.o.after

    Signed-off-by: Dmitry Adamushko
    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra
    Reviewed-by: Thomas Gleixner

    Dmitry Adamushko
     
  • fix delay accounting performance regression - those sched_clock()
    calls are not needed.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra
    Reviewed-by: Thomas Gleixner

    Ingo Molnar
     
  • Get rid of 'sched_entity::fair_key'.

    As a side effect, 'current' is not kept within the tree for
    SCHED_NORMAL/BATCH tasks anymore. This simplifies some parts of code
    (e.g. entity_tick() and yield_task_fair()) and also somewhat optimizes
    them (e.g. a single update_curr() now vs. dequeue/enqueue() before in
    entity_tick()).

    Signed-off-by: Dmitry Adamushko
    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra
    Reviewed-by: Thomas Gleixner

    Dmitry Adamushko
     
  • p->sched_class->set_curr_task() has to be called before
    activate_task()/enqueue_task() in rt_mutex_setprio(),
    sched_setscheduler() and sched_move_task() in order to set up
    'cfs_rq->curr'. The logic of enqueueing depends on whether a task to be
    inserted is 'current' or not.

    Signed-off-by: Dmitry Adamushko
    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra
    Reviewed-by: Thomas Gleixner

    Dmitry Adamushko
     
  • Fix a problem in the 'sched-group' patch for !CONFIG_FAIR_GROUP_SCHED.

    description:

    sched_setscheduler()
    {
        ...
        if (task_running()) p->sched_class->put_prev_entity();

        [ this one sets cfs_rq->curr to NULL ]

        ...

        if (task_running()) p->sched_class->set_curr_task();

        [ and this one is a _NOP_ (empty) for !CONFIG_FAIR_GROUP_SCHED ]
    }

    As a result, the task continues to run with cfs_rq->curr == NULL. There
    are no crashes (due to the !NULL checks in place), but e.g. update_curr()
    effectively becomes a NOP, i.e. runtime statistics for this task are
    not accounted until it is rescheduled.

    Signed-off-by: Dmitry Adamushko
    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra
    Reviewed-by: Thomas Gleixner

    Dmitry Adamushko
     
  • Add interface to control cpu bandwidth allocation to task-groups.

    (not yet configurable, due to missing CONFIG_CONTAINERS)

    Signed-off-by: Srivatsa Vaddagiri
    Signed-off-by: Dhaval Giani
    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra

    Srivatsa Vaddagiri
     
  • fix SMP migration latencies: the vruntimes of different CPUs are
    at incompatible offsets so they have to be fixed up when migrating
    a task across CPUs.

    Signed-off-by: Mike Galbraith
    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra
    Reviewed-by: Thomas Gleixner

    Mike Galbraith