09 Aug, 2007

13 commits

  • There are two problems with balance_tasks() and how it is used:

    1. The variables best_prio and best_prio_seen (inherited from the old
    move_tasks()) were only required to handle problems caused by the
    active/expired arrays, the order in which they were processed and the
    possibility that the task with the highest priority could be on either.
    These issues are no longer present and the extra overhead associated
    with their use is unnecessary (and possibly wrong).

    2. In the absence of CONFIG_FAIR_GROUP_SCHED being set, the same
    this_best_prio variable needs to be used by all scheduling classes or
    there is a risk of moving too much load. E.g. if the highest priority
    task on this_rq at the beginning is a fairly low priority task and the rt
    class migrates a task (during its turn), then that moved task becomes the
    new highest priority task on this_rq. But when the sched_fair class
    initializes its copy of this_best_prio, it will get the priority of the
    original highest priority task because, due to the run queue locks being
    held, the reschedule triggered by pull_task() will not have taken place.
    This could result in inappropriate overriding of skip_for_load and
    excessive load being moved.

    The attached patch addresses these problems by deleting all references to
    best_prio and best_prio_seen and making this_best_prio a reference
    parameter to the various functions involved.

    load_balance_fair() has also been modified so that this_best_prio is
    only reset (in the loop) if CONFIG_FAIR_GROUP_SCHED is set. This should
    preserve the effect of helping spread groups' higher priority tasks
    around the available CPUs while improving system performance when
    CONFIG_FAIR_GROUP_SCHED isn't set.

    Signed-off-by: Peter Williams
    Signed-off-by: Ingo Molnar

    Peter Williams
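
The effect of sharing this_best_prio between scheduling classes can be sketched in user space. This is a toy model, not the kernel's balance_tasks(): struct toy_task, toy_balance_tasks() and the "half the weight" skip rule are simplified assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a runnable task. */
struct toy_task {
    int prio;              /* lower value = higher priority */
    unsigned long weight;  /* load contribution */
    int moved;             /* set once the task has been pulled */
};

/*
 * Toy balancing pass for one scheduling class. The caller passes
 * this_best_prio by reference so that every class sees the updates
 * made by the classes that ran before it.
 */
static unsigned long toy_balance_tasks(struct toy_task *tasks, size_t n,
                                       unsigned long max_load_move,
                                       int *this_best_prio)
{
    unsigned long moved = 0;

    assert(this_best_prio != NULL);
    for (size_t i = 0; i < n && moved < max_load_move; i++) {
        struct toy_task *p = &tasks[i];
        unsigned long rem = max_load_move - moved;
        /* A task much heavier than the remaining load would overshoot. */
        int skip_for_load = (p->weight / 2) > rem;

        /* The skip is overridden only if the task beats the best priority. */
        if (p->moved || (skip_for_load && p->prio >= *this_best_prio))
            continue;

        p->moved = 1;
        moved += p->weight;
        if (p->prio < *this_best_prio)
            *this_best_prio = p->prio;  /* visible to the next class's pass */
    }
    return moved;
}
```

In this model, if an rt pass first pulls a priority-10 task and the fair pass then reuses the same variable, a heavy priority-50 task is correctly skipped. If the fair pass instead starts from a stale copy (say 120), the skip is overridden and far more load than requested is moved.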
     
  • Document the design thinking behind nice levels.

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • The kernel.sched_domain hierarchy is under CTL_UNNUMBERED and thus
    unreachable via sysctl(2). Generating .ctl_number's in such a situation
    is not useful.

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Ingo Molnar

    Alexey Dobriyan
     
  • small delta_exec accounting fix: increase delta_exec and increase
    sum_exec_runtime even if the task is not on the runqueue anymore.

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • cleanup: delta_mine is an unsigned value.

    no code impact:

    text data bss dec hex filename
    27823 2726 16 30565 7765 sched.o.before
    27823 2726 16 30565 7765 sched.o.after

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • speed up schedule(): share the 'now' parameter that deactivate_task()
    was calculating internally.

    ( this also fixes the small accounting window between the deactivate
    call and the pick_next_task() call. )

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • uninline rq_clock() to save 263 bytes of code:

    text data bss dec hex filename
    39561 3642 24 43227 a8db sched.o.before
    39298 3642 24 42964 a7d4 sched.o.after

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • sched_fair.c defines print_cfs_stats, and sched_debug.c uses it, but sched.c
    includes both sched_fair.c and sched_debug.c, so all the references to
    print_cfs_stats occur in the same compilation unit. Thus, mark
    print_cfs_stats static.

    Eliminates a sparse warning:
    warning: symbol 'print_cfs_stats' was not declared. Should it be static?

    Signed-off-by: Josh Triplett
    Signed-off-by: Ingo Molnar

    Josh Triplett
     
  • here's another tiny cleanup. The generated code is not affected (gcc is
    smart enough) but for people looking over the code it is just irritating
    to have the extra conditional.

    Signed-off-by: Ulrich Drepper
    Signed-off-by: Ingo Molnar

    Ulrich Drepper
     
  • a little hint to switch on CONFIG_SCHED_DEBUG should be given.

    Signed-off-by: Ingo Molnar

    Thomas Voegtle
     
  • The move_tasks() function is currently multiplexed with two distinct
    capabilities:

    1. attempt to move a specified amount of weighted load from one run
    queue to another; and
    2. attempt to move a specified number of tasks from one run queue to
    another.

    The first of these capabilities is used in two places, load_balance()
    and load_balance_idle(), and in both of these cases the return value of
    move_tasks() is used purely to decide if tasks/load were moved and no
    notice of the actual number of tasks moved is taken.

    The second capability is used in exactly one place,
    active_load_balance(), to attempt to move exactly one task and, as
    before, the return value is only used as an indicator of success or failure.

    This multiplexing of move_tasks() was introduced, by me, as part of the
    smpnice patches and was motivated by the fact that the alternative, one
    function to move specified load and one to move a single task, would
    have led to two functions of roughly the same complexity as the old
    move_tasks() (or the new balance_tasks()). However, the new modular
    design of the new CFS scheduler allows a simpler solution to be adopted
    and this patch addresses that solution by:

    1. adding a new function, move_one_task(), to be used by
    active_load_balance(); and
    2. making move_tasks() a single purpose function that tries to move a
    specified weighted load and returns 1 for success and 0 for failure.

    One of the consequences of these changes is that neither move_one_task()
    nor the new move_tasks() cares how many tasks sched_class.load_balance()
    moves and this enables its interface to be simplified by returning the
    amount of load moved as its result and removing the load_moved pointer
    from the argument list. This helps simplify the new move_tasks() and
    slightly reduces the amount of work done in each of
    sched_class.load_balance()'s implementations.

    Further simplification, e.g. changes to balance_tasks(), are possible
    but (slightly) complicated by the special needs of load_balance_fair()
    so I've left them to a later patch (if this one gets accepted).

    NB Since move_tasks() gets called with two run queue locks held even
    small reductions in overhead are worthwhile.

    [ mingo@elte.hu ]

    this change also reduces code size nicely:

    text data bss dec hex filename
    39216 3618 24 42858 a76a sched.o.before
    39173 3618 24 42815 a73f sched.o.after

    Signed-off-by: Peter Williams
    Signed-off-by: Ingo Molnar

    Peter Williams
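
The split described above can be sketched as follows. This is a user-space toy, not the kernel code: struct toy_sched_class, toy_load_balance() and the fixed per-class task weight are assumptions made for illustration, and task weights are assumed to be positive.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical scheduling class holding identical movable tasks. */
struct toy_sched_class {
    unsigned long task_weight;      /* weight of each task in this class */
    unsigned long nr_tasks;         /* tasks still available to move */
    struct toy_sched_class *next;   /* next (lower) class, or NULL */
};

/* Per-class hook: returns the amount of load it moved, nothing else. */
static unsigned long toy_load_balance(struct toy_sched_class *c,
                                      unsigned long max_nr_move,
                                      unsigned long max_load_move)
{
    unsigned long load_moved = 0;

    assert(c != NULL);
    while (c->nr_tasks > 0 && max_nr_move > 0 && load_moved < max_load_move) {
        c->nr_tasks--;
        max_nr_move--;
        load_moved += c->task_weight;
    }
    return load_moved;
}

/* Single purpose: try to move max_load_move of load; 1 on success, else 0. */
static int toy_move_tasks(struct toy_sched_class *cls,
                          unsigned long max_load_move)
{
    unsigned long total = 0;

    while (cls && total < max_load_move) {
        total += toy_load_balance(cls, ULONG_MAX, max_load_move - total);
        cls = cls->next;
    }
    return total > 0;
}

/* Single purpose: try to move exactly one task; 1 on success, else 0. */
static int toy_move_one_task(struct toy_sched_class *cls)
{
    for (; cls; cls = cls->next)
        if (toy_load_balance(cls, 1, ULONG_MAX))
            return 1;
    return 0;
}
```

Neither caller needs a task count back from the hook, which is why the hook can return the moved load directly instead of writing it through a pointer argument.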
     
  • Peter Williams suggested flipping the order of update_cpu_load(rq) and
    the ->task_tick() call. This is a NOP for the current scheduler (the
    two functions are independent of each other), but ->task_tick() might
    create some state for update_cpu_load() in the future (or in PlugSched).

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • batch up the sleeper bonus sum a bit more. Anything below
    sched-granularity is too small to make a practical difference
    anyway.

    this optimization reduces the math in high-frequency scheduling
    scenarios.

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     

07 Aug, 2007

9 commits

  • * master.kernel.org:/home/rmk/linux-2.6-arm:
    [ARM] rpc: update defconfig
    [ARM] pata_icside: fix the FIXMEs
    [ARM] 4542/1: AT91: include atmel_lcdc.h in at91sam926{1,3}_devices.c
    [ARM] 4541/1: iop: defconfig updates
    [ARM] 4531/1: remove is_in_rom() prototype

    Linus Torvalds
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
    [CRYPTO] api: fix writing into unallocated memory in setkey_aligned

    Linus Torvalds
     
  • C99 6.10.3[11]: preprocessing directive within the argument list of
    macro invocation => undefined behaviour. Don't do that...

    Signed-off-by: Al Viro
    Signed-off-by: Linus Torvalds

    Al Viro
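
A minimal illustration of the rule being cited, with hypothetical names (TOY_FEATURE, TOY_ADD, TOY_BONUS, toy_total) that do not come from the patch. The undefined pattern is shown only inside a comment; the portable form selects the argument with the preprocessor first and invokes the macro afterwards.

```c
#include <assert.h>  /* for callers that want to check the result */

#define TOY_FEATURE 1                 /* hypothetical configuration switch */
#define TOY_ADD(a, b) ((a) + (b))

/*
 * Undefined behaviour (C99 6.10.3, paragraph 11): a preprocessing
 * directive inside the argument list of a macro invocation.
 *
 *     int total = TOY_ADD(base,
 *     #if TOY_FEATURE
 *                         10
 *     #else
 *                         0
 *     #endif
 *                         );
 */

/* Portable form: pick the argument first, then invoke the macro. */
#if TOY_FEATURE
#define TOY_BONUS 10
#else
#define TOY_BONUS 0
#endif

static int toy_total(int base)
{
    return TOY_ADD(base, TOY_BONUS);
}
```

With TOY_FEATURE set to 1 as above, toy_total(5) evaluates to 15.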
     
  • Lguest drivers need to default to "Y" otherwise they're never selected
    for new builds. (We don't bother prompting, because they're less than
    4k combined, and implied by selecting lguest support).

    Signed-off-by: Rusty Russell
    Signed-off-by: Linus Torvalds

    Rusty Russell
     
  • More fallout from the writeback fixes: debug register transfer
    instructions do their own writeback and thus need to disable the general
    writeback mechanism.

    This fixes oopses and some guest failures on AMD machines (the Intel
    variant decodes the instruction in hardware and thus does not need
    emulation).

    Cc: Alistair John Strachan
    Signed-off-by: Avi Kivity
    Signed-off-by: Linus Torvalds

    Avi Kivity
     
  • * 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
    [NETFILTER]: Add xt_statistic.h to the header list for usermode programs
    [BNX2]: Fix suspend/resume problem.
    [TG3]: Fix suspend/resume problem.

    Linus Torvalds
     
  • * 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6:
    [SPARC32]: Fix build.

    Linus Torvalds
     
  • * master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6: (32 commits)
    [SCSI] aacraid: prevent panic on adapter resource failure
    [SCSI] aha152x: use data accessors and !use_sg cleanup
    [SCSI] aha152x: Fix check_condition code-path
    [SCSI] aha152x: Clean Reset path
    [SCSI] aha152x: preliminary fixes and some comments
    [SCSI] aha152x: use bounce buffer
    [SCSI] aha152x: fix debug mode symbol conflict
    [SCSI] sd: disentangle barriers in SCSI
    [SCSI] lpfc : scsi command accessor fix for 8.2.2
    [SCSI] qlogicpti: Some cosmetic changes
    [SCSI] lpfc 8.2.2 : Change version number to 8.2.2
    [SCSI] lpfc 8.2.2 : Style cleanups
    [SCSI] lpfc 8.2.2 : Miscellaneous Bug Fixes
    [SCSI] lpfc 8.2.2 : Miscellaneous management and logging mods
    [SCSI] lpfc 8.2.2 : Rework the lpfc_printf_log() macro
    [SCSI] lpfc 8.2.2 : Attribute and Parameter splits for vport and physical port
    [SCSI] lpfc 8.2.2 : Fix locking around HBA's port_list
    [SCSI] lpfc 8.2.2 : Error messages and debugfs updates
    [SCSI] initialize shost_data to zero
    [SCSI] mptsas: add SMP passthrough support via bsg
    ...

    Linus Torvalds
     
  • The 965G and above chipsets moved the batch buffer non-secure bits to
    another place. This means that previous drm versions allowed insecure
    batch buffers to be submitted to the hardware by non-privileged users who
    are logged into X and have access to direct rendering.

    Signed-off-by: Dave Airlie
    Signed-off-by: Linus Torvalds

    Dave Airlie
     

06 Aug, 2007

3 commits


05 Aug, 2007

5 commits


04 Aug, 2007

10 commits

  • If the driver fails to allocate the contiguous (DMAable) memory for
    system reasons, we fail to load the instance, but then we try to free
    the allocation in the cleanup code and we get a panic in
    pci_free_consistent(). This is reported against an older kernel, hope
    this is relevant for latest/greatest.

    Signed-off-by: Mark Salyzyn
    Signed-off-by: James Bottomley

    Salyzyn, Mark
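
One common way to avoid this class of bug is to make the cleanup path skip the free when the allocation never succeeded. The sketch below is a user-space model under that assumption; the commit text does not say how the actual patch does it, and struct toy_adapter, toy_dma_free() and the other names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical adapter instance owning one contiguous buffer. */
struct toy_adapter {
    void *comm_addr;
    size_t comm_size;
};

/* Stand-in for a free routine that must never be handed a NULL buffer. */
static void toy_dma_free(void *addr)
{
    assert(addr != NULL);
    free(addr);
}

/* Returns 0 on success, -1 if the buffer could not be allocated. */
static int toy_adapter_init(struct toy_adapter *a, size_t size,
                            int simulate_failure)
{
    a->comm_size = size;
    a->comm_addr = simulate_failure ? NULL : malloc(size);
    return a->comm_addr ? 0 : -1;
}

/* Safe to call after a failed init: frees only what was allocated. */
static void toy_adapter_cleanup(struct toy_adapter *a)
{
    if (a->comm_addr) {
        toy_dma_free(a->comm_addr);
        a->comm_addr = NULL;
    }
}
```

Calling toy_adapter_cleanup() after a failed toy_adapter_init() is then a no-op rather than a crash.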
     
  • And finally this is the regular !use_sg cleanup
    and use of data accessors.

    Signed-off-by: Boaz Harrosh
    Signed-off-by: James Bottomley

    Boaz Harrosh
     
  • The check_condition code path was similar to the Reset path but more
    complicated. It went like this:

    1. Extra space was allocated in aha152x_scdata for mirroring
    scsi_cmnd members.
    2. In aha152x_internal_queue(), every non-check_condition
    (i.e. non-REQUEST_SENSE) command was copied to the above members in
    case of error.
    3. In busfree_run(), in the DONE_SC phase, if a Status of
    SAM_STAT_CHECK_CONDITION was detected, the command was
    re-queued internally using aha152x_internal_queue(,,check_condition,).
    The old command members were overwritten with the
    REQUEST_SENSE info.
    4. In busfree_run(), in the DONE_SC phase again, if it was a
    check_condition command, the info was restored from the mirror
    made at the first call to aha152x_internal_queue() (see 2)
    and the command was completed.

    What I did is:

    1. Allocate less space in aha152x_scdata, only for the 16-byte
    original command (which is actually not needed by scsi-ml
    anymore at this stage, but relying on that is too much knowledge of
    scsi-ml).
    2. If Status == SAM_STAT_CHECK_CONDITION, then, as before,
    re-queue a REQUEST_SENSE command, but only now save the original
    command members (fewer of them).
    3. In aha152x_internal_queue(), just as for Reset, use the
    check_condition hint to set the working members differently, then
    execute the command.
    4. In busfree_run(), in the DONE_SC phase again, restore the needed
    members.

    While at it, this patch fixes a bug. The old code, when sending
    a REQUEST_SENSE for a failed command, would then return with
    cmd->resid == 0, which was the status of the REQUEST_SENSE.
    The failing command's resid was lost. And when would resid
    be interesting if not on a failing command?

    Signed-off-by: Boaz Harrosh
    Signed-off-by: James Bottomley

    Boaz Harrosh
     
  • What the Reset code was doing: save the command's important/dangerous
    info on the stack, NULL those members in scsi_cmnd,
    issue a Reset, wait for it to finish, then restore the members
    and return.

    What I do is save or NULL nothing, but use the "resetting"
    hint in aha152x_internal_queue() to NULL out the working members
    and leave struct scsi_cmnd alone.

    The indentation here looks funny, but it will change/drop in the last
    patch, and this way it is clear what changed.

    Signed-off-by: James Bottomley

    Boaz Harrosh
     
  • hunk by hunk:
    - CHECK_CONDITION is what you get from cmnd->status >> 1,
      i.e. after the status_byte() macro. But here it is used
      directly on the status, where it means 0x1, which is an undefined
      bit in the standard and a status that will never
      be returned by a target.

    - In busfree_run(), at the DONE_SC phase, we have 3 distinct
      operations:
      1. if(DONE_SC->SCp.phase & check_condition)
         The REQUEST_SENSE command returned.
         - Restore the original command.
         - Then continue to operation 3.
      2. if(DONE_SC->SCp.Status==SAM_STAT_CHECK_CONDITION)
         A regular command returned with a status.
         - Internally re-queue a REQUEST_SENSE.
         - Do not do operation 3.
      3. Complete the command and return it to scsi-ml.
      So the 0x2 in both these operations (1, 2) means the SCSI
      check-condition status, hence SAM_STAT_CHECK_CONDITION.

    - Here the code asks about !(DONE_SC->SCp.Status & not_issued),
      but "not_issued" is an enum value belonging to the "phase" member
      and not to the Status returned from the target. The reason this
      works is that not_issued==1 and also CHECK_CONDITION==1
      (remember from hunk 1). So the code was actually asking
      !(DONE_SC->SCp.Status & CHECK_CONDITION), which means
      "Has the status been read from the target yet?"
      Status is read at status_run(). "not_issued" is
      cleared in seldo_run(), which is usually earlier than
      status_run().

    So this patch does nothing as far as the assembly is concerned,
    but it does let the reader understand what is going on.

    Signed-off-by: Boaz Harrosh
    Signed-off-by: James Bottomley

    Boaz Harrosh
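
The two encodings of "check condition" discussed in the first hunk can be compared in a few lines. The TOY_ constants mirror the values the Linux SCSI headers of that period used for CHECK_CONDITION, SAM_STAT_CHECK_CONDITION and status_byte(); the helper function names are hypothetical and only serve the illustration.

```c
#include <assert.h>  /* for callers that want to check the results */

/* Legacy status code: meant to be compared after shifting the status byte. */
#define TOY_CHECK_CONDITION           0x01
/* SAM status code: compared directly against the raw status byte. */
#define TOY_SAM_STAT_CHECK_CONDITION  0x02
/* Same shape as the kernel's status_byte() macro. */
#define toy_status_byte(result) (((result) >> 1) & 0x7f)

/* Correct: raw status against the SAM constant. */
static int toy_is_check_condition(unsigned char status)
{
    return status == TOY_SAM_STAT_CHECK_CONDITION;
}

/* Also correct: shifted status against the legacy constant. */
static int toy_is_check_condition_legacy(unsigned char status)
{
    return toy_status_byte(status) == TOY_CHECK_CONDITION;
}

/* Confusing: legacy constant applied to the raw status tests bit 0 only. */
static int toy_raw_and_legacy(unsigned char status)
{
    return (status & TOY_CHECK_CONDITION) != 0;
}
```

For a raw status of 0x02, the first two helpers report a check condition while the third returns 0, because bit 0 of the raw status byte is never set by a target.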
     
  • Cause highmem buffers to be bounced to low memory until this
    driver supports highmem addresses. Otherwise it just oopses
    on NULL buffer addresses.

    Signed-off-by: Randy Dunlap
    Signed-off-by: Boaz Harrosh
    Signed-off-by: James Bottomley

    Boaz Harrosh
     
  • The symbol conflicts with the rather global one in
    include/linux/locks.h.

    Signed-off-by: Randy Dunlap
    Signed-off-by: Boaz Harrosh
    Signed-off-by: James Bottomley

    Boaz Harrosh
     
  • Our current implementation has a generic set of barrier functions that
    go through the SCSI driver model. Realistically, this is unnecessary,
    because the only device that can use barriers (sd) can set the flush
    functions up at probe or revalidate time. This patch pulls the barrier
    functions out of the mid layer and scsi driver model and relocates them
    directly in sd.

    Acked-by: Tejun Heo
    Signed-off-by: James Bottomley

    James Bottomley
     
  • The device would not resume properly if it was shut down before the system
    was suspended. In such a scenario, where the netif_running state is 0,
    bnx2_suspend() would not save the PCI state and so the memory enable bit
    and bus master enable bit would be lost.

    We fix this by always saving and restoring the PCI state in
    bnx2_suspend() and bnx2_resume() regardless of netif_running() state.

    Update version to 1.6.4.

    Signed-off-by: Michael Chan
    Signed-off-by: David S. Miller

    Michael Chan
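
The failure mode and the fix can be modelled in user space. This is a toy, not driver code: struct toy_nic, toy_suspend() and the always_save flag are assumptions made for illustration, with always_save == 0 standing for the old behaviour and always_save == 1 for the fixed one.

```c
#include <assert.h>

/* Hypothetical subset of PCI configuration state. */
struct toy_pci_state {
    int mem_enable;
    int bus_master;
};

/* Hypothetical network device. */
struct toy_nic {
    int running;                 /* models netif_running() */
    struct toy_pci_state hw;     /* live configuration bits */
    struct toy_pci_state saved;  /* copy taken at suspend time */
    int have_saved;
};

/* Power loss across suspend clears the live configuration bits. */
static void toy_power_down(struct toy_nic *n)
{
    n->hw.mem_enable = 0;
    n->hw.bus_master = 0;
}

static void toy_suspend(struct toy_nic *n, int always_save)
{
    assert(n != 0);
    /* Old behaviour: state is saved only while the interface is running. */
    if (always_save || n->running) {
        n->saved = n->hw;
        n->have_saved = 1;
    }
    toy_power_down(n);
}

static void toy_resume(struct toy_nic *n)
{
    if (n->have_saved) {
        n->hw = n->saved;
        n->have_saved = 0;
    }
}
```

With an interface that is down at suspend time, the old path resumes with both enable bits cleared, while the always-save path restores them.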
     
  • Joachim Deguara reported that tg3 devices
    would not resume properly if the device was shut down before the system
    was suspended. In such a scenario, where the netif_running state is 0,
    tg3_suspend() would not save the PCI state and so the memory enable bit
    and bus master enable bit would be lost.

    We fix this by always saving and restoring the PCI state in
    tg3_suspend() and tg3_resume() regardless of netif_running() state.

    Signed-off-by: Michael Chan
    Signed-off-by: David S. Miller

    Michael Chan