28 Nov, 2007

1 commit


10 Nov, 2007

5 commits

  • This patch adds a proper prototype for migration_init() in
    include/linux/sched.h

    Since there's no point in always returning 0 to a caller that doesn't check
    the return value, it also changes the function to return void.

    Signed-off-by: Adrian Bunk
    Signed-off-by: Andrew Morton
    Signed-off-by: Ingo Molnar

    Adrian Bunk
     
  • SMP balancing is done with IRQs disabled and can iterate the full
    runqueue. When runqueues are large this can cause large IRQ latencies.
    Limit the number of iterations on each run.

    This fixes a scheduling latency regression reported by the -rt folks.

    Signed-off-by: Peter Zijlstra
    Acked-by: Steven Rostedt
    Tested-by: Gregory Haskins
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Remove PREEMPT_RESTRICT. (This is a separate commit so that any
    regression related to the removal itself is bisectable.)

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • Since powerpc started using CONFIG_GENERIC_CLOCKEVENTS, the
    deterministic CPU accounting (CONFIG_VIRT_CPU_ACCOUNTING) has been
    broken on powerpc, because we end up counting user time twice: once in
    timer_interrupt() and once in update_process_times().

    This fixes the problem by pulling the code in update_process_times
    that updates utime and stime into a separate function called
    account_process_tick. If CONFIG_VIRT_CPU_ACCOUNTING is not defined,
    there is a version of account_process_tick in kernel/timer.c that
    simply accounts a whole tick to either utime or stime as before. If
    CONFIG_VIRT_CPU_ACCOUNTING is defined, then arch code gets to
    implement account_process_tick.

    This also lets us simplify the s390 code a bit; it means that the s390
    timer interrupt can now call update_process_times even when
    CONFIG_VIRT_CPU_ACCOUNTING is turned on, and can just implement a
    suitable account_process_tick().

    account_process_tick() now takes the task_struct * as an argument.
    Tested both with and without CONFIG_VIRT_CPU_ACCOUNTING.

    Signed-off-by: Paul Mackerras
    Signed-off-by: Ingo Molnar

    Paul Mackerras
     
  • we lost the sched_min_granularity tunable to a clever optimization
    that uses the sched_latency/min_granularity ratio - but the ratio
    is quite unintuitive to users and can also crash the kernel if the
    ratio is set to 0. So reintroduce the min_granularity tunable,
    while keeping the ratio maintained internally.

    no functionality changed.

    [ mingo@elte.hu: some fixlets. ]

    Signed-off-by: Peter Zijlstra
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

30 Oct, 2007

2 commits


26 Oct, 2007

1 commit

  • [PATCH] De-constify sched.h

    This reverts commit a8972ccf00b7184a743eb6cd9bc7f3443357910c ("sched:
    constify sched.h")

    1) The patch doesn't change any code here, so gcc is already smart
    enough to "feel" constness in such simple functions.
    2) There is no such thing as a const task_struct. Anyone who thinks
    otherwise deserves a compiler warning.

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     

25 Oct, 2007

3 commits

  • At the moment, a lot of load balancing code that is irrelevant to non
    SMP systems gets included during non SMP builds.

    This patch addresses this issue and reduces the binary size on non
    SMP systems:

    text data bss dec hex filename
    10983 28 1192 12203 2fab sched.o.before
    10739 28 1192 11959 2eb7 sched.o.after

    Signed-off-by: Peter Williams
    Signed-off-by: Ingo Molnar

    Peter Williams
     
  • At the moment, balance_tasks() provides low level functionality for both
    move_tasks() and move_one_task() (indirectly) via the load_balance()
    function (in the sched_class interface) which also provides dual
    functionality. This dual functionality complicates the interfaces and
    internal mechanisms and adds to the run time overhead of operations
    that are called with two run queue locks held.

    This patch addresses this issue and reduces the overhead of these
    operations.

    Signed-off-by: Peter Williams
    Signed-off-by: Ingo Molnar

    Peter Williams
     
  • Add const to some struct task_struct * uses

    Signed-off-by: Joe Perches
    Signed-off-by: Ingo Molnar

    Joe Perches
     

20 Oct, 2007

14 commits

  • The pgrp field is not used widely around the kernel so it is now marked as
    deprecated with appropriate comment.

    The initialization of INIT_SIGNALS is trimmed because
    a) the fields are set to 0 automatically;
    b) gcc cannot properly initialize two anonymous unions (the second
    one is the one containing the session). In this particular case,
    to make it compile we'd have to add some field initialized
    right before the .pgrp.

    This is the same patch as the 1ec320afdc9552c92191d5f89fcd1ebe588334ca one
    (from Cedric), but for the pgrp field.

    Some progress report:

    We have to deprecate the pid, tgid, session and pgrp fields on struct
    task_struct and struct signal_struct. The session and pgrp are already
    deprecated. The tgid value is close to being such - the worst known usage
    is in fs/locks.c and audit code. The pid field deprecation is mainly
    blocked by numerous printk-s around the kernel that print the tsk->pid to
    log.

    Signed-off-by: Pavel Emelyanov
    Cc: Oleg Nesterov
    Cc: Sukadev Bhattiprolu
    Cc: Cedric Le Goater
    Cc: Serge Hallyn
    Cc: "Eric W. Biederman"
    Cc: Herbert Poetzl
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Emelyanov
     
  • Add a new per-cpuset flag called 'sched_load_balance'.

    When enabled in a cpuset (the default value) it tells the kernel scheduler
    that the scheduler should provide the normal load balancing on the CPUs in
    that cpuset, sometimes moving tasks from one CPU to a second CPU if the
    second CPU is less loaded and if that task is allowed to run there.

    When disabled (write "0" to the file) then it tells the kernel scheduler
    that load balancing is not required for the CPUs in that cpuset.

    Now even if this flag is disabled for some cpuset, the kernel may still
    have to load balance some or all the CPUs in that cpuset, if some
    overlapping cpuset has its sched_load_balance flag enabled.

    If there are some CPUs that are not in any cpuset whose sched_load_balance
    flag is enabled, the kernel scheduler will not load balance tasks to those
    CPUs.

    Moreover the kernel will partition the 'sched domains' (non-overlapping
    sets of CPUs over which load balancing is attempted) into the finest
    granularity partition that it can find, while still keeping any two CPUs
    that are in the same sched_load_balance enabled cpuset in the same element
    of the partition.

    This serves two purposes:
    1) It provides a mechanism for real time isolation of some CPUs, and
    2) it can be used to improve performance on systems with many CPUs
    by supporting configurations in which load balancing is not done
    across all CPUs at once, but rather only done in several smaller
    disjoint sets of CPUs.

    This mechanism replaces the earlier overloading of the per-cpuset
    flag 'cpu_exclusive', which overloading was removed in an earlier
    patch: cpuset-remove-sched-domain-hooks-from-cpusets

    See further the Documentation and comments in the code itself.

    [akpm@linux-foundation.org: don't be weird]
    Signed-off-by: Paul Jackson
    Acked-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul Jackson
     
  • Since these are expanded into call to pid_nr_ns() anyway, it's OK to move
    the whole routine out-of-line. This is a cheap way to save ~100 bytes from
    vmlinux. Together with the previous two patches, it saves half-a-kilo from
    the vmlinux.

    Un-inlining other (currently inlined) functions must be done with
    additional performance testing.

    Signed-off-by: Pavel Emelyanov
    Cc: Sukadev Bhattiprolu
    Cc: Oleg Nesterov
    Cc: Paul Menage
    Cc: "Eric W. Biederman"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Emelyanov
     
  • With pid namespaces this field is now dangerous to use explicitly, so hide
    it behind the helpers.

    Also the pid and pgrp fields of task_struct and signal_struct are to be
    deprecated. Unfortunately this patch cannot be sent right now as this
    leads to tons of warnings, so start isolating them, and deprecate later.

    Actually the p->tgid == pid has to be changed to has_group_leader_pid(),
    but Oleg pointed out that in case of posix cpu timers this is the same, and
    thread_group_leader() is more preferable.

    Signed-off-by: Pavel Emelyanov
    Acked-by: Oleg Nesterov
    Cc: Sukadev Bhattiprolu
    Cc: "Eric W. Biederman"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Emelyanov
     
  • find_task_by_something is a set of macros used to find a task by pid,
    depending on what kind of pid is proposed - global or virtual one. All of
    them are wrappers above the most generic one - find_task_by_pid_type_ns() -
    and just substitute some args for it.

    It turned out, that dereferencing the current->nsproxy->pid_ns construction
    and pushing one more argument on the stack inline cause kernel text size to
    grow.

    This patch moves all this stuff out-of-line into kernel/pid.c. Together
    with the next patch it saves a bit less than 400 bytes from the .text
    section.

    Signed-off-by: Pavel Emelyanov
    Cc: Sukadev Bhattiprolu
    Cc: Oleg Nesterov
    Cc: Paul Menage
    Cc: "Eric W. Biederman"
    Acked-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Emelyanov
     
  • When clone() is invoked with CLONE_NEWPID, create a new pid namespace and then
    create a new struct pid for the new process. Allocate pid_t's for the new
    process in the new pid namespace and all ancestor pid namespaces. Make the
    newly cloned process the session and process group leader.

    Since the active pid namespace is special and expected to be the first entry
    in pid->upid_list, preserve the order of pid namespaces.

    The size of 'struct pid' depends on the number of pid namespaces the
    process exists in, so we use multiple pid caches. Only one pid cache is
    created during system startup and this is used by processes that exist only in
    init_pid_ns.

    When a process clones its pid namespace, we create additional pid caches as
    necessary and use the pid cache to allocate 'struct pids' for that depth.

    Note that with this patch the newly created namespace won't work, since the
    rest of the kernel still uses global pids, but this is to be fixed soon.
    The init pid namespace still works.

    [oleg@tv-sign.ru: merge fix]
    Signed-off-by: Pavel Emelyanov
    Signed-off-by: Sukadev Bhattiprolu
    Cc: Paul Menage
    Cc: "Eric W. Biederman"
    Cc: Oleg Nesterov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Emelyanov
     
  • * remove pid.h from pid_namespaces.h;
    * rework is_(cgroup|global)_init;
    * optimize (get|put)_pid_ns for init_pid_ns;
    * declare task_child_reaper to return actual reaper.

    Signed-off-by: Pavel Emelyanov
    Cc: Oleg Nesterov
    Cc: Sukadev Bhattiprolu
    Cc: Paul Menage
    Cc: "Eric W. Biederman"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Emelyanov
     
  • When searching for a task by numerical id one may need to find it using its
    global pid (as is done now in the kernel) or by its virtual id, e.g. when
    sending a signal to a task from one namespace the sender will specify the
    task's virtual id and we should find the task by this value.

    [akpm@linux-foundation.org: fix gfs2 linkage]
    Signed-off-by: Pavel Emelyanov
    Cc: Oleg Nesterov
    Cc: Sukadev Bhattiprolu
    Cc: Paul Menage
    Cc: "Eric W. Biederman"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Emelyanov
     
  • When showing a pid to a user, or getting its numerical id for in-kernel
    use, the value of this id may differ depending on the namespace.

    This set of helpers is used to get the global pid nr, the virtual (i.e. seen
    by task in its namespace) nr and the nr as it is seen from the specified
    namespace.

    Signed-off-by: Pavel Emelyanov
    Cc: Oleg Nesterov
    Cc: Sukadev Bhattiprolu
    Cc: Paul Menage
    Cc: "Eric W. Biederman"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Emelyanov
     
  • is_init() is an ambiguous name for the pid==1 check. Split it into
    is_global_init() and is_container_init().

    A cgroup init has its tsk->pid == 1.

    A global init also has its tsk->pid == 1 and its active pid namespace
    is the init_pid_ns. But rather than check the active pid namespace,
    compare the task structure with 'init_pid_ns.child_reaper', which is
    initialized during boot to the /sbin/init process and never changes.

    Changelog:

    2.6.22-rc4-mm2-pidns1:
    - Use 'init_pid_ns.child_reaper' to determine if a given task is the
    global init (/sbin/init) process. This would improve performance
    and remove dependence on the task_pid().

    2.6.21-mm2-pidns2:

    - [Sukadev Bhattiprolu] Changed is_container_init() calls in {powerpc,
    ppc,avr32}/traps.c for the _exception() call to is_global_init().
    This way, we kill only the cgroup if the cgroup's init has a
    bug rather than force a kernel panic.

    [akpm@linux-foundation.org: fix comment]
    [sukadev@us.ibm.com: Use is_global_init() in arch/m32r/mm/fault.c]
    [bunk@stusta.de: kernel/pid.c: remove unused exports]
    [sukadev@us.ibm.com: Fix capability.c to work with threaded init]
    Signed-off-by: Serge E. Hallyn
    Signed-off-by: Sukadev Bhattiprolu
    Acked-by: Pavel Emelianov
    Cc: Eric W. Biederman
    Cc: Cedric Le Goater
    Cc: Dave Hansen
    Cc: Herbert Poetzel
    Cc: Kirill Korotaev
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Serge E. Hallyn
     
  • The set of functions process_session, task_session, process_group and
    task_pgrp is confusing, as the names can be mixed with each other when looking
    at the code for a long time.

    The proposals are to
    * equip the functions that return the integer with _nr suffix to
    represent that fact,
    * and to make all functions work with task (not process) by making
    the common prefix of the same name.

    For consistency, the routines signal_session() and set_signal_session() are
    replaced with task_session_nr() and set_task_session(), especially since they
    are only used with the explicit task->signal dereference.

    Signed-off-by: Pavel Emelianov
    Acked-by: Serge E. Hallyn
    Cc: Kirill Korotaev
    Cc: "Eric W. Biederman"
    Cc: Cedric Le Goater
    Cc: Herbert Poetzl
    Cc: Sukadev Bhattiprolu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Emelianov
     
  • Remove the filesystem support logic from the cpusets system and make
    cpusets a cgroup subsystem.

    The "cpuset" filesystem becomes a dummy filesystem; attempts to mount it get
    passed through to the cgroup filesystem with the appropriate options to
    emulate the old cpuset filesystem behaviour.

    Signed-off-by: Paul Menage
    Cc: Serge E. Hallyn
    Cc: "Eric W. Biederman"
    Cc: Dave Hansen
    Cc: Balbir Singh
    Cc: Paul Jackson
    Cc: Kirill Korotaev
    Cc: Herbert Poetzl
    Cc: Srivatsa Vaddagiri
    Cc: Cedric Le Goater
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul Menage
     
  • Replace the struct css_set embedded in task_struct with a pointer; all tasks
    that have the same set of memberships across all hierarchies will share a
    css_set object, and will be linked via their css_sets field to the "tasks"
    list_head in the css_set.

    Assuming that many tasks share the same cgroup assignments, this reduces
    overall space usage and keeps the size of the task_struct down (three pointers
    added to task_struct compared to a non-cgroups kernel, no matter how many
    subsystems are registered).

    [akpm@linux-foundation.org: fix a printk]
    [akpm@linux-foundation.org: build fix]
    Signed-off-by: Paul Menage
    Cc: Serge E. Hallyn
    Cc: "Eric W. Biederman"
    Cc: Dave Hansen
    Cc: Balbir Singh
    Cc: Paul Jackson
    Cc: Kirill Korotaev
    Cc: Herbert Poetzl
    Cc: Srivatsa Vaddagiri
    Cc: Cedric Le Goater
    Cc: KAMEZAWA Hiroyuki
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul Menage
     
  • Generic Process Control Groups
    ------------------------------

    There have recently been various proposals floating around for
    resource management/accounting and other task grouping subsystems in
    the kernel, including ResGroups, User BeanCounters, NSProxy
    cgroups, and others. These all need the basic abstraction of being
    able to group together multiple processes in an aggregate, in order to
    track/limit the resources permitted to those processes, or control
    other behaviour of the processes, and all implement this grouping in
    different ways.

    This patchset provides a framework for tracking and grouping processes
    into arbitrary "cgroups" and assigning arbitrary state to those
    groupings, in order to control the behaviour of the cgroup as an
    aggregate.

    The intention is that the various resource management and
    virtualization/cgroup efforts can also become task cgroup
    clients, with the result that:

    - the userspace APIs are (somewhat) normalised

    - it's easier to test e.g. the ResGroups CPU controller in
    conjunction with the BeanCounters memory controller, or use either of
    them as the resource-control portion of a virtual server system.

    - the additional kernel footprint of any of the competing resource
    management systems is substantially reduced, since it doesn't need
    to provide process grouping/containment, hence improving their
    chances of getting into the kernel

    This patch:

    Add the main task cgroups framework - the cgroup filesystem, and the
    basic structures for tracking membership and associating subsystem state
    objects to tasks.

    Signed-off-by: Paul Menage
    Cc: Serge E. Hallyn
    Cc: "Eric W. Biederman"
    Cc: Dave Hansen
    Cc: Balbir Singh
    Cc: Paul Jackson
    Cc: Kirill Korotaev
    Cc: Herbert Poetzl
    Cc: Srivatsa Vaddagiri
    Cc: Cedric Le Goater
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul Menage
     

19 Oct, 2007

4 commits

  • * git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched:
    sched: reduce schedstat variable overhead a bit
    sched: add KERN_CONT annotation
    sched: cleanup, make struct rq comments more consistent
    sched: cleanup, fix spacing
    sched: fix return value of wait_for_completion_interruptible()

    Linus Torvalds
     
  • This adds items to the taskstats struct to account for user and system
    time based on scaling the CPU frequency and instruction issue rates.

    Adds account_(user|system)_time_scaled callbacks which architectures
    can use to account for time using this mechanism.

    Signed-off-by: Michael Neuling
    Cc: Balbir Singh
    Cc: Jay Lan
    Cc: Paul Mackerras
    Cc: Benjamin Herrenschmidt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michael Neuling
     
  • Hell knows what happened in commit 63b05203af57e7de4f3bb63b8b81d43bc196d32b
    during 2.6.9 development. The commit introduced the io_wait field, which
    was write-only then and still remains write-only.

    Also garbage collect macros which "use" io_wait.

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • schedstat is useful in investigating CPU scheduler behavior. Ideally,
    I think it is beneficial to have it on all the time. However, the
    cost of turning it on in production system is quite high, largely due
    to number of events it collects and also due to its large memory
    footprint.

    Most of the fields probably don't need to be a full 64 bits on 64-bit
    arches. Rolling over 4 billion events will most likely take a long time,
    and user space tools can be made to accommodate that. I'm proposing that
    the kernel cut back most of the variable widths on 64-bit systems.
    (Note: the following patch doesn't affect 32-bit systems.)

    Signed-off-by: Ken Chen
    Signed-off-by: Ingo Molnar

    Ken Chen
     

18 Oct, 2007

1 commit


17 Oct, 2007

9 commits

  • For those who don't care about CONFIG_SECURITY.

    Signed-off-by: Alexey Dobriyan
    Cc: "Serge E. Hallyn"
    Cc: Casey Schaufler
    Cc: James Morris
    Cc: Stephen Smalley
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • There is a nice 2-byte hole after the struct task_struct::ioprio field
    into which we can put two 1-byte fields: ->fpu_counter and ->oomkilladj.

    Signed-off-by: Alexey Dobriyan
    Acked-by: Arjan van de Ven
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • For those who deselect POSIX message queues.

    Reduces SLAB size of user_struct from 64 to 32 bytes here, SLUB size -- from
    40 bytes to 32 bytes.

    [akpm@linux-foundation.org: fix build]
    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • robust_list, compat_robust_list, pi_state_list and pi_state_cache are
    only really used if futexes are enabled.

    Signed-off-by: Alexey Dobriyan
    Acked-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • oomkilladj is an int, but the values which can be assigned to it are -17
    and [-16, 15], thus fitting into an s8.

    While the patch itself doesn't help in making task_struct smaller, because
    of the natural alignment of ->link_count, it will make the picture clearer
    wrt further task_struct reduction patches. My plan is to move ->fpu_counter
    and ->oomkilladj after ->ioprio, filling the hole on i386 and x86_64. But
    that's for later, because bloated distro configs need looking at as well.

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • This adds the MMF_DUMP_ELF_HEADERS option to /proc/pid/coredump_filter.
    This dumps the first page (only) of a private file mapping if it appears to
    be a mapping of an ELF file. Including these pages in the core dump may
    give sufficient identifying information to associate the original DSO and
    executable file images and their debugging information with a core file in
    a generic way just from its contents (e.g. when those binaries were built
    with ld --build-id). I expect this to become the default behavior
    eventually. Existing versions of gdb can be confused by the core dumps it
    creates, so it won't be enabled by default for some time to come. Soon many
    people will have systems with a gdb that handles these dumps, so they can
    arrange to set the bit at boot and have it inherited system-wide.

    This also cleans up the checking of the MMF_DUMP_* flag bits, which did not
    need to be using atomic macros.

    Signed-off-by: Roland McGrath
    Cc: Hidehiro Kawai
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Roland McGrath
     
  • Control the trigger limit for softlockup warnings. This is useful for
    debugging softlockups, by lowering the softlockup_thresh to identify
    possible softlockups earlier.

    This patch:
    1. Adds a sysctl softlockup_thresh with valid values of 1-60s
    (Higher value to disable false positives)
    2. Changes the softlockup printk to print the cpu softlockup time

    [akpm@linux-foundation.org: Fix various warnings and add definition of "two"]
    Signed-off-by: Ravikiran Thirumalai
    Signed-off-by: Shai Fultheim
    Acked-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ravikiran G Thirumalai
     
  • Based on ideas of Andrew:
    http://marc.info/?l=linux-kernel&m=102912915020543&w=2

    Scale the bdi dirty limit inversely with the task's dirty rate.
    This makes heavy writers have a lower dirty limit than the occasional writer.

    Andrea proposed something similar:
    http://lwn.net/Articles/152277/

    The main disadvantage of his patch is that he uses an unrelated quantity to
    measure time, which leaves him with a workload-dependent tunable. Other than
    that, the two approaches appear quite similar.

    [akpm@linux-foundation.org: fix warning]
    Signed-off-by: Peter Zijlstra
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Peter Zijlstra
     
  • When CONFIG_SYSFS is not set, CONFIG_FAIR_USER_SCHED fails to build
    with

    kernel/built-in.o: In function `uids_kobject_init':
    (.init.text+0x1488): undefined reference to `kernel_subsys'
    kernel/built-in.o: In function `uids_kobject_init':
    (.init.text+0x1490): undefined reference to `kernel_subsys'
    kernel/built-in.o: In function `uids_kobject_init':
    (.init.text+0x1480): undefined reference to `kernel_subsys'
    kernel/built-in.o: In function `uids_kobject_init':
    (.init.text+0x1494): undefined reference to `kernel_subsys'

    This patch fixes this build error.

    Signed-off-by: Srivatsa Vaddagiri
    Signed-off-by: Dhaval Giani
    Signed-off-by: Ingo Molnar

    Dhaval Giani