03 Jan, 2008

1 commit

  • Both SLUB and SLAB really did almost exactly the same thing for
    /proc/slabinfo setup, using duplicate code and per-allocator #ifdef's.

    This just creates a common CONFIG_SLABINFO that is enabled by both SLUB
    and SLAB, and shares all the setup code. Maybe SLOB will want this some
    day too.

    Reviewed-by: Pekka Enberg
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

07 Dec, 2007

1 commit


03 Dec, 2007

1 commit

  • Commit cfb5285660aad4931b2ebbfa902ea48a37dfffa1 removed a useful feature for
    us, which provided a cpu accounting resource controller. This feature would be
    useful if someone wants to group tasks only for accounting purpose and doesnt
    really want to exercise any control over their cpu consumption.

    The patch below reintroduces the feature. It is based on Paul Menage's
    original patch (Commit 62d0df64065e7c135d0002f069444fbdfc64768f), with
    these differences:

    - Removed load average information. I felt it needs more thought (esp
    to deal with SMP and virtualized platforms) and can be added for
    2.6.25 after more discussions.
    - Convert group cpu usage to be nanosecond accurate (as rest of the cfs
    stats are) and invoke cpuacct_charge() from the respective scheduler
    classes
    - Make accounting scalable on SMP systems by splitting the usage
    counter to be per-cpu
    - Move the code from kernel/cpu_acct.c to kernel/sched.c (since the
    code is not big enough to warrant a new file and also this rightly
    needs to live inside the scheduler. Also things like accessing
    rq->lock while reading cpu usage becomes easier if the code lived in
    kernel/sched.c)

    The patch also modifies the cpu controller not to provide the same accounting
    information.

    Tested-by: Balbir Singh

    Tested the patches on top of 2.6.24-rc3. The patches work fine. Ran
    some simple tests like cpuspin (spin on the cpu), ran several tasks in
    the same group and timed them. Compared their time stamps with
    cpuacct.usage.

    Signed-off-by: Srivatsa Vaddagiri
    Signed-off-by: Balbir Singh
    Signed-off-by: Ingo Molnar

    Srivatsa Vaddagiri
     

23 Nov, 2007

1 commit


21 Nov, 2007

1 commit

  • Add appropriate freezer annotations to handle_initrd(), so that it's possible
    to resume from disk from an initrd.

    http://bugzilla.kernel.org/show_bug.cgi?id=9345

    Signed-off-by: Rafael J. Wysocki
    Cc: Pavel Machek
    Cc: Nigel Cunningham
    Cc: Ingo Molnar
    Cc: Chris Friedhoff
    Signed-off-by: Len Brown

    Rafael J. Wysocki
     

15 Nov, 2007

2 commits

  • This is my trivial patch to swat innumerable little bugs with a single
    blow.

    After some intensive review (my apologies for not having gotten to this
    sooner) what we have looks like a good base to build on with the current
    pid namespace code but it is not complete, and it is still much to simple
    to find issues where the kernel does the wrong thing outside of the initial
    pid namespace.

    Until the dust settles and we are certain we have the ABI and the
    implementation is as correct as humanly possible let's keep process ID
    namespaces behind CONFIG_EXPERIMENTAL.

    Allowing us the option of fixing any ABI or other bugs we find as long as
    they are minor.

    Allowing users of the kernel to avoid those bugs simply by ensuring their
    kernel does not have support for multiple pid namespaces.

    [akpm@linux-foundation.org: coding-style cleanups]
    Signed-off-by: Eric W. Biederman
    Cc: Cedric Le Goater
    Cc: Adrian Bunk
    Cc: Jeremy Fitzhardinge
    Cc: Kir Kolyshkin
    Cc: Kirill Korotaev
    Cc: Pavel Emelyanov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • Revert 62d0df64065e7c135d0002f069444fbdfc64768f.

    This was originally intended as a simple initial example of how to create a
    control groups subsystem; it wasn't intended for mainline, but I didn't make
    this clear enough to Andrew.

    The CFS cgroup subsystem now has better functionality for the per-cgroup usage
    accounting (based directly on CFS stats) than the "usage" status file in this
    patch, and the "load" status file is rather simplistic - although having a
    per-cgroup load average report would be a useful feature, I don't believe this
    patch actually provides it. If it gets into the final 2.6.24 we'd probably
    have to support this interface for ever.

    Cc: Paul Menage
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     

10 Nov, 2007

1 commit


25 Oct, 2007

1 commit


21 Oct, 2007

1 commit

  • New kind of audit rule predicates: "object is visible in given subtree".
    The part that can be sanely implemented, that is. Limitations:
    * if you have hardlink from outside of tree, you'd better watch
    it too (or just watch the object itself, obviously)
    * if you mount something under a watched tree, tell audit
    that new chunk should be added to watched subtrees
    * if you umount something in a watched tree and it's still mounted
    elsewhere, you will get matches on events happening there. New command
    tells audit to recalculate the trees, trimming such sources of false
    positives.

    Note that it's _not_ about path - if something mounted in several places
    (multiple mount, bindings, different namespaces, etc.), the match does
    _not_ depend on which one we are using for access.

    Signed-off-by: Al Viro

    Al Viro
     

20 Oct, 2007

8 commits

  • Spelling fix in init/.

    Signed-off-by: Simon Arlott
    Signed-off-by: Adrian Bunk

    Simon Arlott
     
  • The header file already enforces a suitably recent
    version of gcc, so there's no point checking for that again.

    Signed-off-by: Robert P. J. Day
    Signed-off-by: Adrian Bunk

    Robert P. J. Day
     
  • Enable "cgroup" (formerly containers) based fair group scheduling. This
    will let administrator create arbitrary groups of tasks (using "cgroup"
    pseudo filesystem) and control their cpu bandwidth usage.

    [akpm@linux-foundation.org: fix cpp condition]
    Signed-off-by: Srivatsa Vaddagiri
    Signed-off-by: Dhaval Giani
    Cc: Randy Dunlap
    Cc: Balbir Singh
    Cc: Paul Menage
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Srivatsa Vaddagiri
     
  • When a task enters a new namespace via a clone() or unshare(), a new cgroup
    is created and the task moves into it.

    This version names cgroups which are automatically created using
    cgroup_clone() as "node_" where pid is the pid of the unsharing or
    cloned process. (Thanks Pavel for the idea) This is safe because if the
    process unshares again, it will create

    /cgroups/(...)/node_/node_

    The only possibilities (AFAICT) for a -EEXIST on unshare are

    1. pid wraparound
    2. a process fails an unshare, then tries again.

    Case 1 is unlikely enough that I ignore it (at least for now). In case 2, the
    node_ will be empty and can be rmdir'ed to make the subsequent unshare()
    succeed.

    Changelog:
    Name cloned cgroups as "node_".

    [clg@fr.ibm.com: fix order of cgroup subsystems in init/Kconfig]
    Signed-off-by: Serge E. Hallyn
    Cc: Paul Menage
    Signed-off-by: Cedric Le Goater
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Serge E. Hallyn
     
  • This example subsystem exports debugging information as an aid to diagnosing
    refcount leaks, etc, in the cgroup framework.

    Signed-off-by: Paul Menage
    Cc: Serge E. Hallyn
    Cc: "Eric W. Biederman"
    Cc: Dave Hansen
    Cc: Balbir Singh
    Cc: Paul Jackson
    Cc: Kirill Korotaev
    Cc: Herbert Poetzl
    Cc: Srivatsa Vaddagiri
    Cc: Cedric Le Goater
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul Menage
     
  • This example demonstrates how to use the generic cgroup subsystem for a
    simple resource tracker that counts, for the processes in a cgroup, the
    total CPU time used and the %CPU used in the last complete 10 second interval.

    Portions contributed by Balbir Singh

    Signed-off-by: Paul Menage
    Cc: Serge E. Hallyn
    Cc: "Eric W. Biederman"
    Cc: Dave Hansen
    Cc: Balbir Singh
    Cc: Paul Jackson
    Cc: Kirill Korotaev
    Cc: Herbert Poetzl
    Cc: Srivatsa Vaddagiri
    Cc: Cedric Le Goater
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul Menage
     
  • Remove the filesystem support logic from the cpusets system and makes cpusets
    a cgroup subsystem

    The "cpuset" filesystem becomes a dummy filesystem; attempts to mount it get
    passed through to the cgroup filesystem with the appropriate options to
    emulate the old cpuset filesystem behaviour.

    Signed-off-by: Paul Menage
    Cc: Serge E. Hallyn
    Cc: "Eric W. Biederman"
    Cc: Dave Hansen
    Cc: Balbir Singh
    Cc: Paul Jackson
    Cc: Kirill Korotaev
    Cc: Herbert Poetzl
    Cc: Srivatsa Vaddagiri
    Cc: Cedric Le Goater
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul Menage
     
  • Generic Process Control Groups
    --------------------------

    There have recently been various proposals floating around for
    resource management/accounting and other task grouping subsystems in
    the kernel, including ResGroups, User BeanCounters, NSProxy
    cgroups, and others. These all need the basic abstraction of being
    able to group together multiple processes in an aggregate, in order to
    track/limit the resources permitted to those processes, or control
    other behaviour of the processes, and all implement this grouping in
    different ways.

    This patchset provides a framework for tracking and grouping processes
    into arbitrary "cgroups" and assigning arbitrary state to those
    groupings, in order to control the behaviour of the cgroup as an
    aggregate.

    The intention is that the various resource management and
    virtualization/cgroup efforts can also become task cgroup
    clients, with the result that:

    - the userspace APIs are (somewhat) normalised

    - it's easier to test e.g. the ResGroups CPU controller in
    conjunction with the BeanCounters memory controller, or use either of
    them as the resource-control portion of a virtual server system.

    - the additional kernel footprint of any of the competing resource
    management systems is substantially reduced, since it doesn't need
    to provide process grouping/containment, hence improving their
    chances of getting into the kernel

    This patch:

    Add the main task cgroups framework - the cgroup filesystem, and the
    basic structures for tracking membership and associating subsystem state
    objects to tasks.

    Signed-off-by: Paul Menage
    Cc: Serge E. Hallyn
    Cc: "Eric W. Biederman"
    Cc: Dave Hansen
    Cc: Balbir Singh
    Cc: Paul Jackson
    Cc: Kirill Korotaev
    Cc: Herbert Poetzl
    Cc: Srivatsa Vaddagiri
    Cc: Cedric Le Goater
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul Menage
     

19 Oct, 2007

1 commit

  • Get rid of sparse related warnings from places that use integer as NULL
    pointer.

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Stephen Hemminger
    Cc: Andi Kleen
    Cc: Jeff Garzik
    Cc: Matt Mackall
    Cc: Ian Kent
    Cc: Arnd Bergmann
    Cc: Davide Libenzi
    Cc: Stephen Smalley
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Stephen Hemminger
     

17 Oct, 2007

5 commits

  • Kconfig.preempt is not included on some archs (for example, m68k). On those
    archs, the Kconfig machinery complains that KVM selects an undefined symbol
    PREEMPT_NOTIFIERS (which lives in Kconfig.preempt).

    So move the offending symbol into a Kconfig file which is included by
    everyone.

    Cc: Roman Zippel
    Cc: Geert Uytterhoeven
    Signed-off-by: Avi Kivity
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Avi Kivity
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild: (40 commits)
    kbuild: introduce ccflags-y, asflags-y and ldflags-y
    kbuild: enable 'make CPPFLAGS=...' to add additional options to CPP
    kbuild: enable use of AFLAGS and CFLAGS on commandline
    kbuild: enable 'make AFLAGS=...' to add additional options to AS
    kbuild: fix AFLAGS use in h8300 and m68knommu
    kbuild: check for wrong use of CFLAGS
    kbuild: enable 'make CFLAGS=...' to add additional options to CC
    kbuild: fix up CFLAGS usage
    kbuild: make modpost detect unterminated device id lists
    kbuild: call export_report from the Makefile
    kbuild: move Kai Germaschewski to CREDITS
    kconfig/menuconfig: distinguish between selected-by-another options and comments
    kconfig: tristate choices with mixed tristate and boolean values
    include/linux/Kbuild: remove duplicate entries
    kbuild: kill backward compatibility checks
    kbuild: kill EXTRA_ARFLAGS
    kbuild: fix documentation in makefiles.txt
    kbuild: call make once for all targets when O=.. is used
    kbuild: pass -g to assembler under CONFIG_DEBUG_INFO
    kbuild: update _shipped files for kconfig syntax cleanup
    ...

    Fix up conflicts in arch/um/sys-{x86_64,i386}/Makefile manually.

    Linus Torvalds
     
  • Grouping pages by mobility can be disabled at compile-time. This was
    considered undesirable by a number of people. However, in the current stack of
    patches, it is not a simple case of just dropping the configurable patch as it
    would cause merge conflicts. This patch backs out the configuration option.

    Signed-off-by: Mel Gorman
    Acked-by: Andy Whitcroft
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman
     
  • The grouping mechanism has some memory overhead and a more complex allocation
    path. This patch allows the strategy to be disabled for small memory systems
    or if it is known the workload is suffering because of the strategy. It also
    acts to show where the page groupings strategy interacts with the standard
    buddy allocator.

    Signed-off-by: Mel Gorman
    Signed-off-by: Joel Schopp
    Cc: Andy Whitcroft
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman
     
  • Optionally add a boot delay after each kernel printk() call, crudely
    measured in milliseconds, with a maximum delay of 10 seconds per printk.

    Enable CONFIG_BOOT_PRINTK_DELAY=y and then add (e.g.):
    "lpj=loops_per_jiffy boot_delay=100"
    to the kernel command line.

    It has been useful in cases like "during boot, my machine just reboots or the
    screen goes black" by slowing down printk, (and adding initcall_debug), we can
    usually see the last thing that happened before the lights went out which is
    usually a valuable clue.

    [akpm@linux-foundation.org: not all architectures implement CONFIG_HZ]
    [akpm@linux-foundation.org: fix lots of stuff]
    [bunk@stusta.de: kernel/printk.c: make 2 variables static]
    [heiko.carstens@de.ibm.com: fix slow down printk on boot compile error]
    Signed-off-by: Randy Dunlap
    Signed-off-by: Dave Jones
    Signed-off-by: Adrian Bunk
    Signed-off-by: Heiko Carstens
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy Dunlap
     

15 Oct, 2007

7 commits

  • Fix coding style issues reported by Randy Dunlap and others

    Signed-off-by: Dhaval Giani
    Signed-off-by: Srivatsa Vaddagiri
    Signed-off-by: Ingo Molnar
    Reviewed-by: Thomas Gleixner

    Srivatsa Vaddagiri
     
  • enable CONFIG_FAIR_GROUP_SCHED=y by default.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra
    Reviewed-by: Thomas Gleixner

    Ingo Molnar
     
  • fair-group sched, cleanups.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra
    Reviewed-by: Thomas Gleixner

    Ingo Molnar
     
  • Enable user-id based fair group scheduling. This is useful for anyone
    who wants to test the group scheduler w/o having to enable
    CONFIG_CGROUPS.

    A separate scheduling group (i.e struct task_grp) is automatically created for
    every new user added to the system. Upon uid change for a task, it is made to
    move to the corresponding scheduling group.

    A /proc tunable (/proc/root_user_share) is also provided to tune root
    user's quota of cpu bandwidth.

    Signed-off-by: Srivatsa Vaddagiri
    Signed-off-by: Dhaval Giani
    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra
    Reviewed-by: Thomas Gleixner

    Srivatsa Vaddagiri
     
  • With the view of supporting user-id based fair scheduling (and not just
    container-based fair scheduling), this patch renames several functions
    and makes them independent of whether they are being used for container
    or user-id based fair scheduling.

    Also fix a problem reported by KAMEZAWA Hiroyuki (wrt allocating
    less-sized array for tg->cfs_rq[] and tf->se[]).

    Signed-off-by: Srivatsa Vaddagiri
    Signed-off-by: Dhaval Giani
    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra
    Reviewed-by: Thomas Gleixner

    Srivatsa Vaddagiri
     
  • Add interface to control cpu bandwidth allocation to task-groups.

    (not yet configurable, due to missing CONFIG_CONTAINERS)

    Signed-off-by: Srivatsa Vaddagiri
    Signed-off-by: Dhaval Giani
    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra

    Srivatsa Vaddagiri
     
  • The variable CFLAGS is a wellknown variable and the usage by
    kbuild may result in unexpected behaviour.
    On top of that several people over time has asked for a way to
    pass in additional flags to gcc.

    This patch replace use of CFLAGS with KBUILD_CFLAGS all over the
    tree and enabling one to use:
    make CFLAGS=...
    to specify additional gcc commandline options.

    One usecase is when trying to find gcc bugs but other
    use cases has been requested too.

    Patch was tested on following architectures:
    alpha, arm, i386, x86_64, mips, sparc, sparc64, ia64, m68k

    Test was simple to do a defconfig build, apply the patch and check
    that nothing got rebuild.

    Signed-off-by: Sam Ravnborg

    Sam Ravnborg
     

20 Sep, 2007

2 commits

  • There is still some confusion and disagreement over what this interface should
    actually do. So it is best that we disable it in 2.6.23 until we get that
    fully sorted out.

    (sys_timerfd() was present in 2.6.22 but it was apparently broken, so here we
    assume that nobody is using it yet).

    Cc: Michael Kerrisk
    Cc: Davide Libenzi
    Acked-by: Linus Torvalds
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     
  • Commit 831441862956fffa17b9801db37e6ea1650b0f69 (Freezer: make kernel
    threads nonfreezable by default) breaks freezing when attempting to resume
    from an initrd, because the init (which is freezeable) spins while waiting
    for another thread to run /linuxrc, but doesn't check whether it has been
    told to enter the refrigerator. The original patch replaced a call to
    try_to_freeze() with a call to yield(). I believe a simple reversion is
    wrong because if !CONFIG_PM_SLEEP, try_to_freeze() is a noop. It should
    still yield.

    Signed-off-by: Nigel Cunningham
    Acked-by: Rafael J. Wysocki
    Acked-by: Pavel Machek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nigel Cunningham
     

31 Aug, 2007

1 commit

  • Alexey Dobriyan reports that maxcpus=1 is still broken in 2.6.23-rc4:
    if CONFIG_HOTPLUG_CPU is not set, x86_64 bootup oopses in show_stat() -
    for_each_possible_cpu accesses a per-cpu area which was never set up.

    Alexey identified commit 61ec7567db103d537329b0db9a887db570431ff4
    (ACPI: boot correctly with "nosmp" or "maxcpus=0") as the origin;
    but it's not really to blame, just exposes a bug in 2.6.23-rc1's commit
    8b3b295502444340dd0701855ac422fbf32e161d (Especially when !CONFIG_HOTPLUG_CPU,
    avoid needlessy allocating resources for CPUs that can never become available).

    rc1's test for max_cpus < 2 in start_kernel() wasn't working because
    max_cpus was still NR_CPUS at that point: until rc4 moved the maxcpus
    parsing earlier. Now it sets cpu_possible_map to 1 before allocating
    all possible per-cpu areas; then smp_init() expands cpu_possible_map
    to cpu_present_map (0xf in my case) later on.

    rc1's commit has good intentions, but expects cpu_present_map to be
    limited by maxcpus, which is only the case on i386. cpus_and(possible,
    possible,present) might be good, but needs an audit of cpu_present_map
    uses - there may well be assumptions that any cpu present is possible.

    So stay safe for now and just revert those #ifndef CONFIG_HOTPLUG_CPU
    optimizations in rc1's commit.

    Signed-off-by: Hugh Dickins
    Cc: Alexey Dobriyan
    Cc: Len Brown
    Cc: Andrew Morton
    Cc: Jan Beulich
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     

28 Aug, 2007

1 commit

  • Commit 61ec7567db103d537329b0db9a887db570431ff4 ('ACPI: boot correctly
    with "nosmp" or "maxcpus=0"') broke 'maxcpus=' handling on x86[-64].

    maxcpus=N is now having no effect on x86_64, and freezing bootup on i386
    (because of inconsistency with the separate maxcpus parsing down in
    arch/i386, I guess). That's because early_param parsing is a little
    different from __setup parsing, and needs the "=" omitted: then it seems
    to work as the original commit intended (no mention of IO-APIC in
    /proc/interrupts when maxcpus=0).

    Signed-off-by: Hugh Dickins
    Cc: Andrew Morton
    Cc: Len Brown
    Cc: Andi Kleen
    Cc: Rusty Russell
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     

21 Aug, 2007

1 commit

  • In MPS mode, "nosmp" and "maxcpus=0" boot a UP kernel with IOAPIC disabled.
    However, in ACPI mode, these parameters didn't completely disable
    the IO APIC initialization code and boot failed.

    init/main.c:
    Disable the IO_APIC if "nosmp" or "maxcpus=0"
    undefine disable_ioapic_setup() when it doesn't apply.

    i386:
    delete ioapic_setup(), it was a duplicate of parse_noapic()
    delete undefinition of disable_ioapic_setup()

    x86_64:
    rename disable_ioapic_setup() to parse_noapic() to match i386
    define disable_ioapic_setup() in header to match i386

    http://bugzilla.kernel.org/show_bug.cgi?id=1641

    Acked-by: Andi Kleen
    Signed-off-by: Len Brown

    Len Brown
     

01 Aug, 2007

2 commits


31 Jul, 2007

1 commit

  • * master.kernel.org:/pub/scm/linux/kernel/git/lethal/sh-2.6.23:
    sh: Fix fs.h removal from mm.h regressions.
    sh: fix get_wchan() for SH kernels without framepointers
    sh: arch/sh/boot - fix shell usage
    rtc: rtc-sh: Correct sh_rtc_set_time() for some SH-3 parts.
    sh: remove support for sh7300 and solution engine 7300
    sh: Add sh to the CC_OPTIMIZE_FOR_SIZE dependencies.
    sh: Kill off virt_to_bus()/bus_to_virt().
    sh: sh-sci - fix SH7708 support
    sh: Restrict DSP support to specific CPUs.
    sh: Silence sq compile warning on sh4 nommu.
    sh: Kill the rest of the SE73180 cruft.
    sh: remove support for sh73180 and solution engine 73180
    sh: remove old broken pint code
    sh: Reclaim beginning of P3 space for vmalloc area.
    sh: Fix Dreamcast DMA issues.
    sh: Add kmap_coherent()/kunmap_coherent() interface for SH-4.

    Linus Torvalds
     

27 Jul, 2007

1 commit