17 Jul, 2010

1 commit

  • As of commit dcce284 ("mm: Extend gfp masking to the page
    allocator") and commit 7e85ee0 ("slab,slub: don't enable
    interrupts during early boot"), the slab allocator makes
    sure we don't attempt to sleep during boot.

    Therefore, remove bootmem special cases from the scheduler
    and use plain GFP_KERNEL instead.
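    The mechanism the two cited commits rely on can be modelled in a few
    lines. The flag bit values below are hypothetical stand-ins, not the
    kernel's actual gfp layout: during early boot the allocator masks out
    the sleep-implying bits, so callers may pass plain GFP_KERNEL safely.

```python
# Simplified model of boot-time gfp masking; the bit values are hypothetical.
GFP_WAIT = 1 << 0   # "may sleep" -- must not be honored during early boot
GFP_IO   = 1 << 1
GFP_FS   = 1 << 2
GFP_KERNEL = GFP_WAIT | GFP_IO | GFP_FS

# Until boot completes, the allowed mask excludes the sleeping bit.
gfp_allowed_mask = GFP_IO | GFP_FS

def masked_alloc_flags(flags):
    """The allocator strips disallowed bits itself, so callers no longer
    need to special-case early boot."""
    return flags & gfp_allowed_mask
```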

    Signed-off-by: Pekka Enberg
    Cc: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Pekka Enberg
     

30 Mar, 2010

1 commit

  • …it slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities to include
    those headers directly instead of assuming their availability. As
    this conversion needs to touch a large number of source files, the
    following script was used as the basis of the conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the following:

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there, i.e. if only gfp is used,
    gfp.h; if slab is used, slab.h.

    * When the script inserts a new include, it looks at the include
    blocks and tries to put the new include such that its order conforms
    to its surroundings. It's put in the include block which contains
    core kernel includes, in the same order that the rest are ordered -
    alphabetical, Christmas tree, rev-Xmas-tree, or at the end if there
    doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have a fitting include block), it prints
    out an error message indicating which .h file needs to be added to
    the file.
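    The heart of the scan step can be sketched in a few lines of Python.
    The identifier tables below are tiny stand-ins; the real
    slabh-sweep.py matches a much larger set of gfp and slab symbols:

```python
import re

# Identifiers implying each header -- a deliberately small sample.
GFP_USERS = {"GFP_KERNEL", "GFP_ATOMIC", "gfp_t"}
SLAB_USERS = {"kmalloc", "kzalloc", "kfree", "kmem_cache_alloc"}

def needed_includes(source):
    """Decide which of gfp.h/slab.h a file needs; slab.h suffices on
    its own because it pulls in gfp.h itself."""
    tokens = set(re.findall(r"\w+", source))
    if tokens & SLAB_USERS:
        return ["linux/slab.h"]
    if tokens & GFP_USERS:
        return ["linux/gfp.h"]
    return []
```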

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition while adding it to implementation .h or
    embedding .c file was more appropriate for others. This step added
    inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored as stuff from gfp.h was usually
    widely available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build tests were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on archs to make things
    build (like ipr on powerpc/64 which failed due to missing writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as bisection point.

    Given the fact that I had only a couple of failures from tests on step
    6, I'm fairly confident about the coverage of this conversion patch.
    If there is a breakage, it's likely to be something in one of the arch
    headers which should be easily discoverable on most builds of the
    specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
     

08 Mar, 2010

1 commit


07 Mar, 2010

1 commit

  • Rename for_each_bit() to for_each_set_bit() in the kernel source tree,
    to permit for_each_clear_bit(), should that ever be added.

    The patch includes a macro to map the old for_each_bit() onto the new
    for_each_set_bit(). This is a (very) temporary thing to ease the
    migration.
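    The rename and the transitional alias are easy to model. This is a
    Python sketch of the semantics, not the kernel's C macro:

```python
def for_each_set_bit(word, size):
    """Yield positions of set bits in `word`, lowest first -- the
    semantics of the renamed kernel macro."""
    for bit in range(size):
        if word & (1 << bit):
            yield bit

# The (very) temporary compatibility mapping: old name, new behavior.
for_each_bit = for_each_set_bit
```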

    [akpm@linux-foundation.org: add temporary for_each_bit()]
    Suggested-by: Alexey Dobriyan
    Suggested-by: Andrew Morton
    Signed-off-by: Akinobu Mita
    Cc: "David S. Miller"
    Cc: Russell King
    Cc: David Woodhouse
    Cc: Artem Bityutskiy
    Cc: Stephen Rothwell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Akinobu Mita
     

04 Feb, 2010

1 commit


02 Feb, 2010

1 commit


15 Dec, 2009

1 commit


02 Aug, 2009

2 commits

  • We need to add the new prio to the cpupri accounting before
    removing the old prio. This is because removing the old prio
    first will open a race window where the cpu will be removed
    from pri_active. In this case the cpu will not be visible for
    RT push and pulls. This could cause a RT task to not migrate
    appropriately, and create a very large latency.
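    A toy model shows why the ordering matters. CpuPri below is a
    simplified stand-in for the kernel structure, with a cpu set per
    priority level and a pri_active bitmap of non-empty levels. Adding
    the new priority before removing the old one means the cpu is never
    absent from pri_active mid-update:

```python
class CpuPri:
    """Toy model: one cpu set per priority plus a pri_active bitmap."""
    def __init__(self, nr_prios):
        self.vec = [set() for _ in range(nr_prios)]
        self.pri_active = 0

    def _add(self, cpu, prio):
        self.vec[prio].add(cpu)
        self.pri_active |= 1 << prio

    def _remove(self, cpu, prio):
        self.vec[prio].discard(cpu)
        if not self.vec[prio]:
            self.pri_active &= ~(1 << prio)

    def set_prio(self, cpu, old, new):
        # Add under the new priority *before* removing the old one, so
        # the cpu stays visible to RT push/pull throughout the update.
        self._add(cpu, new)
        self._remove(cpu, old)

    def visible(self, cpu):
        return any(cpu in s for p, s in enumerate(self.vec)
                   if self.pri_active & (1 << p))
```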

    This bug was found with the use of ftrace sched events and
    trace_printk.

    Signed-off-by: Steven Rostedt
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Steven Rostedt
     
  • Background:

    Several race conditions in the scheduler have cropped up
    recently, which Steven and I have tracked down using ftrace.
    The most recent one turns out to be a race in how the scheduler
    determines a suitable migration target for RT tasks, introduced
    recently with commit:

    commit 68e74568fbe5854952355e942acca51f138096d9
    Date: Tue Nov 25 02:35:13 2008 +1030

    sched: convert struct cpupri_vec cpumask_var_t.

    The original design of cpupri allowed lockless readers to
    quickly determine a best-estimate target. Races between the
    pri_active bitmap and the vec->mask were handled in the
    original code because we would detect and return "0" when this
    occurred. The design was predicated on the *effective*
    atomicity (*) of caching the result of cpus_and() between the
    cpus_allowed and the vec->mask.

    Commit 68e74568 changed the behavior such that vec->mask is
    accessed multiple times. This introduces a subtle race, the
    result of which means we can have a result that returns "1",
    but with an empty bitmap.

    *) yes, we know cpus_and() is not a locked operator across the
    entire composite array, but it is implicitly atomic on a
    per-word basis which is all the design required to work.

    Implementation:

    Rather than forgoing the lockless design, or reverting to a
    stack-based cpumask_t, we simply check for when the race has
    been encountered and continue processing in the event that the
    race is hit. This renders the removal race as if the priority
    bit had been atomically cleared as well, and allows the
    algorithm to execute correctly.
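    The fix can be sketched as follows; the function below is a
    simplified stand-in for the kernel's cpupri_find(), with pri_active
    as an int bitmap and vec as a list of cpu sets. A set pri_active bit
    paired with an empty intersection is treated as the race and skipped:

```python
def cpupri_find(pri_active, vec, allowed):
    """Scan priority levels; tolerate the remove race by treating an
    empty intersection as if the priority bit had been cleared."""
    for prio, cpus in enumerate(vec):
        if not (pri_active >> prio) & 1:
            continue
        mask = cpus & allowed
        if not mask:
            continue  # race: vec emptied after pri_active was read
        return mask
    return None
```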

    Signed-off-by: Gregory Haskins
    CC: Rusty Russell
    CC: Steven Rostedt
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Gregory Haskins
     

17 Jun, 2009

1 commit

  • Those two functions no longer call alloc_bootmem_cpumask_var(),
    so there is no need to tag them with __init_refok.

    Signed-off-by: Li Zefan
    Acked-by: Pekka Enberg
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Li Zefan
     

12 Jun, 2009

1 commit


09 Jun, 2009

1 commit


01 Apr, 2009

1 commit


06 Jan, 2009

1 commit


25 Nov, 2008

1 commit

  • Impact: stack usage reduction, (future) size reduction for large NR_CPUS.

    Dynamically allocating cpumasks (when CONFIG_CPUMASK_OFFSTACK) saves
    space for small nr_cpu_ids but big CONFIG_NR_CPUS.

    The fact that cpupri_init() is called both before and after the slab
    is available unfortunately makes for an ugly parameter.

    We also use cpumask_any_and to get rid of a temporary in cpupri_find.
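    cpumask_any_and() picks an arbitrary cpu from the intersection of two
    masks, or returns nr_cpu_ids when the intersection is empty, which is
    what lets the temporary mask disappear. A Python model over int
    bitmaps (NR_CPU_IDS is an arbitrary value chosen for the sketch):

```python
NR_CPU_IDS = 8  # arbitrary sketch value

def cpumask_any_and(a, b):
    """Index of some set bit in a & b, or NR_CPU_IDS if none -- the
    contract of the kernel helper, modelled on int bitmaps."""
    both = a & b
    if not both:
        return NR_CPU_IDS
    return (both & -both).bit_length() - 1  # lowest set bit
```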

    Signed-off-by: Rusty Russell
    Signed-off-by: Ingo Molnar

    Rusty Russell
     

06 Jun, 2008

1 commit

  • The current code uses a linear algorithm, which causes scaling issues
    on larger SMP machines. This patch replaces that algorithm with a
    2-dimensional bitmap to reduce latencies in the wake-up path.
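    The 2-dimensional structure can be sketched as a priority bitmap
    (first dimension) indexing into per-priority cpu sets (second
    dimension); the lowest active priority is found with one bit
    operation rather than a per-cpu linear scan. The names below are
    illustrative, not the kernel's:

```python
def lowest_set_prio(pri_active):
    """First dimension: lowest-numbered active priority level, found
    with a single bit trick."""
    return (pri_active & -pri_active).bit_length() - 1

def find_lowest_cpus(pri_active, vec):
    """Second dimension: the cpu set stored at that priority level."""
    if not pri_active:
        return set()
    return vec[lowest_set_prio(pri_active)]
```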

    Signed-off-by: Gregory Haskins
    Acked-by: Steven Rostedt
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner

    Gregory Haskins