20 Oct, 2007
5 commits
-
When a task enters a new namespace via a clone() or unshare(), a new cgroup
is created and the task moves into it.

This version names cgroups which are automatically created using
cgroup_clone() as "node_<pid>" where pid is the pid of the unsharing or
cloned process. (Thanks Pavel for the idea) This is safe because if the
process unshares again, it will create /cgroups/(...)/node_<pid>/node_<pid>

The only possibilities (AFAICT) for a -EEXIST on unshare are
1. pid wraparound
2. a process fails an unshare, then tries again.

Case 1 is unlikely enough that I ignore it (at least for now). In case 2, the
node_<pid> will be empty and can be rmdir'ed to make the subsequent unshare()
succeed.

Changelog:
Name cloned cgroups as "node_<pid>".

[clg@fr.ibm.com: fix order of cgroup subsystems in init/Kconfig]
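
As a rough illustration of the naming scheme described above, the sketch
below formats a "node_<pid>" directory name in userspace C. It is not the
kernel's cgroup_clone() code; the helper name format_node_name() is
hypothetical and exists only for this example.

```c
#include <stdio.h>
#include <sys/types.h>

/*
 * Hypothetical userspace helper (not kernel code): format the directory
 * name of an automatically created cgroup as "node_<pid>".
 * Returns 0 on success, -1 if the buffer is too small.
 */
static int format_node_name(char *buf, size_t len, pid_t pid)
{
	int written = snprintf(buf, len, "node_%ld", (long)pid);

	return (written < 0 || (size_t)written >= len) ? -1 : 0;
}
```

Because the pid is part of the name, a second unshare by the same process
would nest a directory of the same name inside the first one rather than
collide with it.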
Signed-off-by: Serge E. Hallyn
Cc: Paul Menage
Signed-off-by: Cedric Le Goater
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
This example subsystem exports debugging information as an aid to diagnosing
refcount leaks, etc, in the cgroup framework.

Signed-off-by: Paul Menage
Cc: Serge E. Hallyn
Cc: "Eric W. Biederman"
Cc: Dave Hansen
Cc: Balbir Singh
Cc: Paul Jackson
Cc: Kirill Korotaev
Cc: Herbert Poetzl
Cc: Srivatsa Vaddagiri
Cc: Cedric Le Goater
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
This example demonstrates how to use the generic cgroup subsystem for a
simple resource tracker that counts, for the processes in a cgroup, the
total CPU time used and the %CPU used in the last complete 10 second interval.

Portions contributed by Balbir Singh
Signed-off-by: Paul Menage
Cc: Serge E. Hallyn
Cc: "Eric W. Biederman"
Cc: Dave Hansen
Cc: Balbir Singh
Cc: Paul Jackson
Cc: Kirill Korotaev
Cc: Herbert Poetzl
Cc: Srivatsa Vaddagiri
Cc: Cedric Le Goater
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
Remove the filesystem support logic from the cpusets system and make cpusets
a cgroup subsystem.

The "cpuset" filesystem becomes a dummy filesystem; attempts to mount it get
passed through to the cgroup filesystem with the appropriate options to
emulate the old cpuset filesystem behaviour.

Signed-off-by: Paul Menage
Cc: Serge E. Hallyn
Cc: "Eric W. Biederman"
Cc: Dave Hansen
Cc: Balbir Singh
Cc: Paul Jackson
Cc: Kirill Korotaev
Cc: Herbert Poetzl
Cc: Srivatsa Vaddagiri
Cc: Cedric Le Goater
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
Generic Process Control Groups
--------------------------

There have recently been various proposals floating around for
resource management/accounting and other task grouping subsystems in
the kernel, including ResGroups, User BeanCounters, NSProxy
cgroups, and others. These all need the basic abstraction of being
able to group together multiple processes in an aggregate, in order to
track/limit the resources permitted to those processes, or control
other behaviour of the processes, and all implement this grouping in
different ways.

This patchset provides a framework for tracking and grouping processes
into arbitrary "cgroups" and assigning arbitrary state to those
groupings, in order to control the behaviour of the cgroup as an
aggregate.

The intention is that the various resource management and
virtualization/cgroup efforts can also become task cgroup
clients, with the result that:

- the userspace APIs are (somewhat) normalised

- it's easier to test e.g. the ResGroups CPU controller in
conjunction with the BeanCounters memory controller, or use either of
them as the resource-control portion of a virtual server system.

- the additional kernel footprint of any of the competing resource
management systems is substantially reduced, since it doesn't need
to provide process grouping/containment, hence improving their
chances of getting into the kernel

This patch:

Add the main task cgroups framework - the cgroup filesystem, and the
basic structures for tracking membership and associating subsystem state
objects to tasks.

Signed-off-by: Paul Menage
Cc: Serge E. Hallyn
Cc: "Eric W. Biederman"
Cc: Dave Hansen
Cc: Balbir Singh
Cc: Paul Jackson
Cc: Kirill Korotaev
Cc: Herbert Poetzl
Cc: Srivatsa Vaddagiri
Cc: Cedric Le Goater
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
19 Oct, 2007
1 commit
-
Get rid of sparse related warnings from places that use integer as NULL
pointer.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Stephen Hemminger
Cc: Andi Kleen
Cc: Jeff Garzik
Cc: Matt Mackall
Cc: Ian Kent
Cc: Arnd Bergmann
Cc: Davide Libenzi
Cc: Stephen Smalley
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
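
For readers unfamiliar with this class of warning, here is a minimal
sketch of the pattern being cleaned up. The function first_negative() is a
hypothetical example, not code touched by the patch; it only shows a
pointer being initialised and compared with NULL rather than a plain 0,
which is the usage sparse reports as "Using plain integer as NULL pointer".

```c
#include <stddef.h>

/*
 * Hypothetical example: return a pointer to the first negative element,
 * or NULL if there is none.
 */
static const int *first_negative(const int *values, size_t count)
{
	/* Writing "= 0" here compiles, but sparse warns about it. */
	const int *found = NULL;
	size_t i;

	for (i = 0; i < count; i++) {
		if (values[i] < 0) {
			found = &values[i];
			break;
		}
	}
	return found;
}
```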
17 Oct, 2007
5 commits
-
Kconfig.preempt is not included on some archs (for example, m68k). On those
archs, the Kconfig machinery complains that KVM selects an undefined symbol
PREEMPT_NOTIFIERS (which lives in Kconfig.preempt).

So move the offending symbol into a Kconfig file which is included by
everyone.

Cc: Roman Zippel
Cc: Geert Uytterhoeven
Signed-off-by: Avi Kivity
Cc: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
* git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild: (40 commits)
kbuild: introduce ccflags-y, asflags-y and ldflags-y
kbuild: enable 'make CPPFLAGS=...' to add additional options to CPP
kbuild: enable use of AFLAGS and CFLAGS on commandline
kbuild: enable 'make AFLAGS=...' to add additional options to AS
kbuild: fix AFLAGS use in h8300 and m68knommu
kbuild: check for wrong use of CFLAGS
kbuild: enable 'make CFLAGS=...' to add additional options to CC
kbuild: fix up CFLAGS usage
kbuild: make modpost detect unterminated device id lists
kbuild: call export_report from the Makefile
kbuild: move Kai Germaschewski to CREDITS
kconfig/menuconfig: distinguish between selected-by-another options and comments
kconfig: tristate choices with mixed tristate and boolean values
include/linux/Kbuild: remove duplicate entries
kbuild: kill backward compatibility checks
kbuild: kill EXTRA_ARFLAGS
kbuild: fix documentation in makefiles.txt
kbuild: call make once for all targets when O=.. is used
kbuild: pass -g to assembler under CONFIG_DEBUG_INFO
kbuild: update _shipped files for kconfig syntax cleanup
...

Fix up conflicts in arch/um/sys-{x86_64,i386}/Makefile manually.
-
Grouping pages by mobility can be disabled at compile-time. This was
considered undesirable by a number of people. However, in the current stack of
patches, it is not a simple case of just dropping the configurable patch as it
would cause merge conflicts. This patch backs out the configuration option.

Signed-off-by: Mel Gorman
Acked-by: Andy Whitcroft
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
The grouping mechanism has some memory overhead and a more complex allocation
path. This patch allows the strategy to be disabled for small memory systems
or if it is known the workload is suffering because of the strategy. It also
acts to show where the page groupings strategy interacts with the standard
buddy allocator.

Signed-off-by: Mel Gorman
Signed-off-by: Joel Schopp
Cc: Andy Whitcroft
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
Optionally add a boot delay after each kernel printk() call, crudely
measured in milliseconds, with a maximum delay of 10 seconds per printk.

Enable CONFIG_BOOT_PRINTK_DELAY=y and then add (e.g.):
"lpj=loops_per_jiffy boot_delay=100"
to the kernel command line.

It has been useful in cases like "during boot, my machine just reboots or the
screen goes black": by slowing down printk (and adding initcall_debug), we can
usually see the last thing that happened before the lights went out, which is
usually a valuable clue.

[akpm@linux-foundation.org: not all architectures implement CONFIG_HZ]
[akpm@linux-foundation.org: fix lots of stuff]
[bunk@stusta.de: kernel/printk.c: make 2 variables static]
[heiko.carstens@de.ibm.com: fix slow down printk on boot compile error]
Signed-off-by: Randy Dunlap
Signed-off-by: Dave Jones
Signed-off-by: Adrian Bunk
Signed-off-by: Heiko Carstens
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
15 Oct, 2007
7 commits
-
Fix coding style issues reported by Randy Dunlap and others
Signed-off-by: Dhaval Giani
Signed-off-by: Srivatsa Vaddagiri
Signed-off-by: Ingo Molnar
Reviewed-by: Thomas Gleixner
-
enable CONFIG_FAIR_GROUP_SCHED=y by default.
Signed-off-by: Ingo Molnar
Signed-off-by: Peter Zijlstra
Reviewed-by: Thomas Gleixner
-
fair-group sched, cleanups.
Signed-off-by: Ingo Molnar
Signed-off-by: Peter Zijlstra
Reviewed-by: Thomas Gleixner
-
Enable user-id based fair group scheduling. This is useful for anyone
who wants to test the group scheduler w/o having to enable
CONFIG_CGROUPS.

A separate scheduling group (i.e. struct task_grp) is automatically created for
every new user added to the system. Upon uid change for a task, it is made to
move to the corresponding scheduling group.

A /proc tunable (/proc/root_user_share) is also provided to tune root
user's quota of cpu bandwidth.

Signed-off-by: Srivatsa Vaddagiri
Signed-off-by: Dhaval Giani
Signed-off-by: Ingo Molnar
Signed-off-by: Peter Zijlstra
Reviewed-by: Thomas Gleixner
-
With the view of supporting user-id based fair scheduling (and not just
container-based fair scheduling), this patch renames several functions
and makes them independent of whether they are being used for container
or user-id based fair scheduling.

Also fix a problem reported by KAMEZAWA Hiroyuki (wrt allocating
less-sized array for tg->cfs_rq[] and tg->se[]).

Signed-off-by: Srivatsa Vaddagiri
Signed-off-by: Dhaval Giani
Signed-off-by: Ingo Molnar
Signed-off-by: Peter Zijlstra
Reviewed-by: Thomas Gleixner
-
Add interface to control cpu bandwidth allocation to task-groups.
(not yet configurable, due to missing CONFIG_CONTAINERS)
Signed-off-by: Srivatsa Vaddagiri
Signed-off-by: Dhaval Giani
Signed-off-by: Ingo Molnar
Signed-off-by: Peter Zijlstra
-
The variable CFLAGS is a well-known variable and the usage by
kbuild may result in unexpected behaviour.
On top of that, several people over time have asked for a way to
pass in additional flags to gcc.

This patch replaces use of CFLAGS with KBUILD_CFLAGS all over the
tree, enabling one to use:
make CFLAGS=...
to specify additional gcc commandline options.

One use case is when trying to find gcc bugs, but other
use cases have been requested too.

The patch was tested on the following architectures:
alpha, arm, i386, x86_64, mips, sparc, sparc64, ia64, m68k

The test was simply to do a defconfig build, apply the patch and check
that nothing got rebuilt.

Signed-off-by: Sam Ravnborg
20 Sep, 2007
2 commits
-
There is still some confusion and disagreement over what this interface should
actually do. So it is best that we disable it in 2.6.23 until we get that
fully sorted out.

(sys_timerfd() was present in 2.6.22 but it was apparently broken, so here we
assume that nobody is using it yet).

Cc: Michael Kerrisk
Cc: Davide Libenzi
Acked-by: Linus Torvalds
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
Commit 831441862956fffa17b9801db37e6ea1650b0f69 (Freezer: make kernel
threads nonfreezable by default) breaks freezing when attempting to resume
from an initrd, because the init (which is freezeable) spins while waiting
for another thread to run /linuxrc, but doesn't check whether it has been
told to enter the refrigerator. The original patch replaced a call to
try_to_freeze() with a call to yield(). I believe a simple reversion is
wrong because if !CONFIG_PM_SLEEP, try_to_freeze() is a noop. It should
still yield.

Signed-off-by: Nigel Cunningham
Acked-by: Rafael J. Wysocki
Acked-by: Pavel Machek
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
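
The reasoning above (check for a freeze request and still yield on every
iteration) can be modelled in plain userspace C. This is only a sketch
under stated assumptions: struct wait_stats, model_try_to_freeze(),
model_yield() and wait_for_helper() are hypothetical stand-ins, not the
kernel's functions, and the loop bound replaces the real condition of
waiting for the thread that runs /linuxrc.

```c
#include <stdbool.h>

/* Hypothetical bookkeeping for the model; not a kernel structure. */
struct wait_stats {
	bool freeze_requested;	/* set when the freezer wants this task frozen */
	int frozen;		/* times the task entered the refrigerator */
	int yielded;		/* times the task gave up the CPU */
};

/* Stand-in for try_to_freeze(): acts only when a freeze was requested. */
static void model_try_to_freeze(struct wait_stats *s)
{
	if (s->freeze_requested) {
		s->frozen++;
		s->freeze_requested = false;
	}
}

/* Stand-in for yield(): always lets other threads run. */
static void model_yield(struct wait_stats *s)
{
	s->yielded++;
}

/*
 * Spin for `rounds` iterations, standing in for "until the helper thread
 * has finished". Each round honours a pending freeze request and then
 * yields, so the loop still yields when the freeze check does nothing.
 */
static void wait_for_helper(struct wait_stats *s, int rounds)
{
	while (rounds-- > 0) {
		model_try_to_freeze(s);
		model_yield(s);
	}
}
```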
31 Aug, 2007
1 commit
-
Alexey Dobriyan reports that maxcpus=1 is still broken in 2.6.23-rc4:
if CONFIG_HOTPLUG_CPU is not set, x86_64 bootup oopses in show_stat() -
for_each_possible_cpu accesses a per-cpu area which was never set up.

Alexey identified commit 61ec7567db103d537329b0db9a887db570431ff4
(ACPI: boot correctly with "nosmp" or "maxcpus=0") as the origin;
but it's not really to blame, just exposes a bug in 2.6.23-rc1's commit
8b3b295502444340dd0701855ac422fbf32e161d (Especially when !CONFIG_HOTPLUG_CPU,
avoid needlessly allocating resources for CPUs that can never become available).

rc1's test for max_cpus < 2 in start_kernel() wasn't working because
max_cpus was still NR_CPUS at that point: until rc4 moved the maxcpus
parsing earlier. Now it sets cpu_possible_map to 1 before allocating
all possible per-cpu areas; then smp_init() expands cpu_possible_map
to cpu_present_map (0xf in my case) later on.

rc1's commit has good intentions, but expects cpu_present_map to be
limited by maxcpus, which is only the case on i386. cpus_and(possible,
possible,present) might be good, but needs an audit of cpu_present_map
uses - there may well be assumptions that any cpu present is possible.

So stay safe for now and just revert those #ifndef CONFIG_HOTPLUG_CPU
optimizations in rc1's commit.

Signed-off-by: Hugh Dickins
Cc: Alexey Dobriyan
Cc: Len Brown
Cc: Andrew Morton
Cc: Jan Beulich
Signed-off-by: Linus Torvalds
28 Aug, 2007
1 commit
-
Commit 61ec7567db103d537329b0db9a887db570431ff4 ('ACPI: boot correctly
with "nosmp" or "maxcpus=0"') broke 'maxcpus=' handling on x86[-64].

maxcpus=N is now having no effect on x86_64, and freezing bootup on i386
(because of inconsistency with the separate maxcpus parsing down in
arch/i386, I guess). That's because early_param parsing is a little
different from __setup parsing, and needs the "=" omitted: then it seems
to work as the original commit intended (no mention of IO-APIC in
/proc/interrupts when maxcpus=0).

Signed-off-by: Hugh Dickins
Cc: Andrew Morton
Cc: Len Brown
Cc: Andi Kleen
Cc: Rusty Russell
Signed-off-by: Linus Torvalds
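
The difference between the two parsing styles can be pictured with a
simplified userspace model. This is an assumption-laden sketch, not the
kernel's parser: setup_style_match() and early_style_match() are
hypothetical names. The model assumes that __setup-style handlers are
matched as a prefix of the raw "key=value" string, while early_param-style
handlers are compared against the key only, after the parser has already
split the argument at "=".

```c
#include <string.h>

/*
 * Hypothetical model of __setup-style matching: the registered string
 * is compared as a prefix of the raw argument, so it includes the "=".
 */
static int setup_style_match(const char *registered, const char *arg)
{
	return strncmp(arg, registered, strlen(registered)) == 0;
}

/*
 * Hypothetical model of early_param-style matching: only the key part of
 * the argument (everything before "=") is compared, and it must match
 * the registered string exactly.
 */
static int early_style_match(const char *registered, const char *arg)
{
	size_t key_len = strcspn(arg, "=");

	return strlen(registered) == key_len &&
	       strncmp(arg, registered, key_len) == 0;
}
```

Under this model, a handler registered as "maxcpus=" matches "maxcpus=1"
in the __setup style but never in the early_param style, which is
consistent with the fix of omitting the "=".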
21 Aug, 2007
1 commit
-
In MPS mode, "nosmp" and "maxcpus=0" boot a UP kernel with IOAPIC disabled.
However, in ACPI mode, these parameters didn't completely disable
the IO APIC initialization code and boot failed.

init/main.c:
Disable the IO_APIC if "nosmp" or "maxcpus=0"
undefine disable_ioapic_setup() when it doesn't apply.

i386:
delete ioapic_setup(), it was a duplicate of parse_noapic()
delete undefinition of disable_ioapic_setup()

x86_64:
rename disable_ioapic_setup() to parse_noapic() to match i386
define disable_ioapic_setup() in header to match i386

http://bugzilla.kernel.org/show_bug.cgi?id=1641
Acked-by: Andi Kleen
Signed-off-by: Len Brown
01 Aug, 2007
2 commits
-
Remove the top level menu "Code maturity level options", and move its
options into menu "General setup".

This makes Kconfig less cluttered and easier to set up.
Signed-off-by: Al Boldi
Acked-by: Sam Ravnborg
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
There doesn't seem to be a good reason for ANON_INODES being
a user-visible option.

Signed-off-by: Adrian Bunk
Acked-by: Davide Libenzi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
31 Jul, 2007
1 commit
-
* master.kernel.org:/pub/scm/linux/kernel/git/lethal/sh-2.6.23:
sh: Fix fs.h removal from mm.h regressions.
sh: fix get_wchan() for SH kernels without framepointers
sh: arch/sh/boot - fix shell usage
rtc: rtc-sh: Correct sh_rtc_set_time() for some SH-3 parts.
sh: remove support for sh7300 and solution engine 7300
sh: Add sh to the CC_OPTIMIZE_FOR_SIZE dependencies.
sh: Kill off virt_to_bus()/bus_to_virt().
sh: sh-sci - fix SH7708 support
sh: Restrict DSP support to specific CPUs.
sh: Silence sq compile warning on sh4 nommu.
sh: Kill the rest of the SE73180 cruft.
sh: remove support for sh73180 and solution engine 73180
sh: remove old broken pint code
sh: Reclaim beginning of P3 space for vmalloc area.
sh: Fix Dreamcast DMA issues.
sh: Add kmap_coherent()/kunmap_coherent() interface for SH-4.
27 Jul, 2007
1 commit
-
Signed-off-by: Al Viro
Signed-off-by: Linus Torvalds
26 Jul, 2007
1 commit
-
Presently we only use this with CONFIG_EXPERIMENTAL, but it is
something that can be supported commonly.

Signed-off-by: Paul Mundt
18 Jul, 2007
2 commits
-
Currently, the freezer treats all tasks as freezable, except for the kernel
threads that explicitly set the PF_NOFREEZE flag for themselves. This
approach is problematic, since it requires every kernel thread to either
set PF_NOFREEZE explicitly, or call try_to_freeze(), even if it doesn't
care for the freezing of tasks at all.

It seems better to only require the kernel threads that want to or need to
be frozen to use some freezer-related code and to remove any
freezer-related code from the other (nonfreezable) kernel threads, which is
done in this patch.

The patch causes all kernel threads to be nonfreezable by default (ie. to
have PF_NOFREEZE set by default) and introduces the set_freezable()
function that should be called by the freezable kernel threads in order to
unset PF_NOFREEZE. It also makes all of the currently freezable kernel
threads call set_freezable(), so it shouldn't cause any (intentional)
change of behaviour to appear. Additionally, it updates documentation to
describe the freezing of tasks more accurately.

[akpm@linux-foundation.org: build fixes]
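
The opt-in scheme described above can be pictured with a small userspace
model. It is only a sketch: struct task_model, MODEL_PF_NOFREEZE and the
three helpers are hypothetical names, and the flag value is illustrative
rather than the kernel's actual PF_NOFREEZE bit.

```c
#include <stdbool.h>

#define MODEL_PF_NOFREEZE 0x1u	/* illustrative bit, not the kernel's value */

/* Hypothetical stand-in for the flags field of a task. */
struct task_model {
	unsigned int flags;
};

/* New kernel threads start out nonfreezable by default. */
static void kthread_model_init(struct task_model *t)
{
	t->flags = MODEL_PF_NOFREEZE;
}

/* A thread that wants to be frozen opts in by clearing the flag. */
static void set_freezable_model(struct task_model *t)
{
	t->flags &= ~MODEL_PF_NOFREEZE;
}

/* The freezer would only consider tasks without the flag. */
static bool is_freezable(const struct task_model *t)
{
	return !(t->flags & MODEL_PF_NOFREEZE);
}
```

The point of the design is that a thread which never touches any
freezer-related code is simply left alone by the freezer.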
Signed-off-by: Rafael J. Wysocki
Acked-by: Nigel Cunningham
Cc: Pavel Machek
Cc: Oleg Nesterov
Cc: Gautham R Shenoy
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
There are some reports that 2.6.22 has SLUB as the default. Not
true!

This will make SLUB the default for 2.6.23.
Signed-off-by: Christoph Lameter
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
17 Jul, 2007
8 commits
-
Especially when !CONFIG_HOTPLUG_CPU, avoid needlessly allocating resources for
CPUs that can never become available.

Signed-off-by: Jan Beulich
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
It's useful sometimes to disable the softlockup checker at boottime.
Especially if it triggers during a distro install.

Signed-off-by: Dave Jones
Cc: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
Basically, it will allow a process to unshare its user_struct table,
resetting at the same time its own user_struct and all the associated
accounting.

A new root user (uid == 0) is added to the user namespace upon creation.
Such root users have full privileges and it seems that these privileges
should be controlled through some means (process capabilities ?)

The unshare is not included in this patch.

Changes since [try #4]:
- Updated get_user_ns and put_user_ns to accept NULL, and
get_user_ns to return the namespace.

Changes since [try #3]:
- moved struct user_namespace to files user_namespace.{c,h}

Changes since [try #2]:
- removed struct user_namespace* argument from find_user()

Changes since [try #1]:
- removed struct user_namespace* argument from find_user()
- added a root_user per user namespace

Signed-off-by: Cedric Le Goater
Signed-off-by: Serge E. Hallyn
Acked-by: Pavel Emelianov
Cc: Herbert Poetzl
Cc: Kirill Korotaev
Cc: Eric W. Biederman
Cc: Chris Wright
Cc: Stephen Smalley
Cc: James Morris
Cc: Andrew Morgan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
CONFIG_UTS_NS and CONFIG_IPC_NS have very little value as they only
deactivate the unshare of the uts and ipc namespaces and do not improve
performance.

Signed-off-by: Cedric Le Goater
Acked-by: "Serge E. Hallyn"
Cc: Eric W. Biederman
Cc: Herbert Poetzl
Cc: Pavel Emelianov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
Some buses (e.g. USB and MMC) do their scanning of devices in the
background, causing a race between them and prepare_namespace(). In order
to be able to use these buses without an initrd, we now wait for the device
specified in root= to actually show up.

If the device never shows up then we will hang in an infinite loop. In
order to not mess with setups that reboot on panic, the feature must be
turned on via the command line option "rootwait".

[bunk@stusta.de: root_wait can become static]
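
The behaviour described above (one attempt by default, polling forever
with "rootwait") can be sketched as a userspace model. The names
find_root_device(), root_probe_fn and appears_after_countdown() are
hypothetical, and the probe callback stands in for whatever check the
kernel performs to see whether the root= device exists yet.

```c
#include <stdbool.h>

/* Hypothetical probe callback: returns true once the root device exists. */
typedef bool (*root_probe_fn)(void *ctx);

/*
 * Look for the root device. Without root_wait a single probe is made;
 * with root_wait the probe is repeated until it succeeds, so the call
 * never returns if the device never appears. Returns the number of
 * probes made, or -1 if the device was not found.
 */
static int find_root_device(root_probe_fn probe, void *ctx, bool root_wait)
{
	int probes = 0;

	do {
		probes++;
		if (probe(ctx))
			return probes;
		/* a real implementation would sleep here between polls */
	} while (root_wait);

	return -1;
}

/* Example probe: the device "appears" once the counter reaches zero. */
static bool appears_after_countdown(void *ctx)
{
	int *remaining = ctx;

	return --(*remaining) <= 0;
}
```

Keeping the single-probe path as the default mirrors the stated goal of
not changing behaviour for setups that rely on a panic and reboot when the
root device is missing.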
Signed-off-by: Pierre Ossman
Cc: Al Viro
Cc: Christoph Hellwig
Signed-off-by: Adrian Bunk
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
Change menuconfig objects from "menu, config" into "menuconfig" so that the
user can disable the whole feature without entering its menu first.

Signed-off-by: Jan Engelhardt
Cc: Rusty Russell
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
Currently slob is disabled if we're using sparsemem, due to an earlier
patch from Goto-san. Slob and static sparsemem work without any trouble as
it is, and the only hiccup is a missing slab_is_available() in the case of
sparsemem extreme. With this, we're rid of the last set of restrictions
for slob usage.

Signed-off-by: Paul Mundt
Acked-by: Pekka Enberg
Acked-by: Matt Mackall
Cc: Christoph Lameter
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
Because SERIAL_PORT_DFNS is removed from include/asm-i386/serial.h and
include/asm-x86_64/serial.h, the serial8250_ports need to be probed late in
the serial initialization stage. The chain console_init =>
serial8250_console_init => register_console => serial8250_console_setup
will return -ENODEV, and console ttyS0 cannot be enabled at that time. It
has to wait until uart_add_one_port in drivers/serial/serial_core.c calls
register_console to get console ttyS0, which is too late.

Make early_uart use early_param, so the uart console can be used earlier.
Make it a boot console with the CON_BOOT flag, so it can use the console
handover feature and switch to the corresponding normal serial console
automatically.

The new command line will be:
console=uart8250,io,0x3f8,9600n8
console=uart8250,mmio,0xff5e0000,115200n8
or
earlycon=uart8250,io,0x3f8,9600n8
earlycon=uart8250,mmio,0xff5e0000,115200n8

It will print in a very early stage:
Early serial console at I/O port 0x3f8 (options '9600n8')
console [uart0] enabled
Later, for the console, it will print:
console handover: boot [uart0] -> real [ttyS0]

Signed-off-by:
Cc: Andi Kleen
Cc: Bjorn Helgaas
Cc: Russell King
Cc: Gerd Hoffmann
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
10 Jul, 2007
2 commits
-
"menu, endmenu" that did not get cleaned up in the block patch
[ http://lkml.org/lkml/2007/4/10/251 ]

Signed-off-by: Jan Engelhardt
Signed-off-by: Andrew Morton
Signed-off-by: Jens Axboe
-
add the init_idle_bootup_task() callback to the bootup thread,
unused at the moment. (CFS will use it to switch the scheduling
class of the boot thread to the idle class)

Signed-off-by: Ingo Molnar