20 Apr, 2008
1 commit
-
Viktor was nice enough to enhance the document based on my replies to
his questions on the subject.Signed-off-by: Peter Zijlstra
Signed-off-by: Ingo Molnar
14 Apr, 2008
1 commit
-
The per node counters are used mainly for showing data through the sysfs API.
If that API is not compiled in then there is no point in keeping track of this
data. Disable counters for the number of slabs and the number of total slabs
if !SLUB_DEBUG. Incrementing the per node counters is also accessing a
potentially contended cacheline so this could actually be a performance
benefit to embedded systems.SLABINFO support is also affected. It now must depends on SLUB_DEBUG (which
is on by default).Patch also avoids a check for a NULL kmem_cache_node pointer in new_slab()
if the system is not compiled with NUMA support.[penberg@cs.helsinki.fi: fix oops and move ->nr_slabs into CONFIG_SLUB_DEBUG]
Signed-off-by: Christoph Lameter
Signed-off-by: Pekka Enberg
11 Mar, 2008
1 commit
-
The original preemptible-RCU patch put the choice between classic and
preemptible RCU into kernel/Kconfig.preempt, which resulted in build failures
on machines not supporting CONFIG_PREEMPT. This choice was therefore moved to
init/Kconfig, which worked, but placed the choice between classic and
preemptible RCU at the top level, a very obtuse choice indeed.This patch changes from the Kconfig "choice" mechanism to a pair of booleans,
only one of which (CONFIG_PREEMPT_RCU) is user-visible, and is located in
kernel/Kconfig.preempt, where one would expect it to be. The other
(CONFIG_CLASSIC_RCU) is in init/Kconfig so that it is available to all
architectures, hopefully avoiding build breakage. Thanks to Roman Zippel for
suggesting this approach.Signed-off-by: Paul E. McKenney
Cc: Ingo Molnar
Acked-by: Steven Rostedt
Cc: Dipankar Sarma
Cc: Josh Triplett
Cc: Thomas Gleixner
Cc: Peter Zijlstra
Cc: Roman Zippel
Cc: Sam Ravnborg
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
05 Mar, 2008
4 commits
-
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-2.6:
debugfs: fix sparse warnings
Driver core: Fix cleanup when failing device_add().
driver core: Remove dpm_sysfs_remove() from error path of device_add()
PM: fix new mutex-locking bug in the PM core
PM: Do not acquire device semaphores upfront during suspend
kobject: properly initialize ksets
sysfs: CONFIG_SYSFS_DEPRECATED fix
driver core: fix up Kconfig text for CONFIG_SYSFS_DEPRECATED -
Rename Memory Controller to Memory Resource Controller. Reflect the same
changes in the CONFIG definition for the Memory Resource Controller. Group
together the config options for Resource Counters and Memory Resource
Controller.Signed-off-by: Balbir Singh
Cc: Paul Menage
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
CONFIG_SYSFS_DEPRECATED=y changed its meaning recently and causes
regressions in working setups that had SYSFS_DEPRECATED disabled.so rename it to SYSFS_DEPRECATED_V2 so that testers pick up the new
default via 'make oldconfig', even if their old .config's disabled
CONFIG_SYSFS_DEPRECATED ...Signed-off-by: Ingo Molnar
Cc: Kay Sievers
Cc: Linus Torvalds
Cc: Andrew Morton
Signed-off-by: Greg Kroah-Hartman -
As things get moved into this config option, the hard date of 2006 does
not work anymore, so update the text to be more descriptive.Cc: Kay Sievers
Cc: Jiri Slaby
Signed-off-by: Greg Kroah-Hartman
24 Feb, 2008
1 commit
-
Document huge memory/cache overhead of memory controller in Kconfig
I was a little surprised that 2.6.25-rc* increased struct page for the
memory controller. At least on many x86-64 machines it will not fit into a
single cache line now anymore and also costs considerable amounts of RAM.
At earlier review I remembered asking for a external data structure for
this.It's also quite unobvious that a innocent looking Kconfig option with a
single line Kconfig description has such a negative effect.This patch attempts to document these disadvantages at least so that users
configuring their kernel can make a informed decision.Signed-off-by: Andi Kleen
Cc: Balbir Singh
Acked-by: Paul Menage
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
13 Feb, 2008
1 commit
-
Make the rt group scheduler compile time configurable.
Keep it experimental for now.Signed-off-by: Peter Zijlstra
Signed-off-by: Ingo Molnar
10 Feb, 2008
1 commit
-
Signed-off-by: Ingo Molnar
Signed-off-by: Thomas Gleixner
09 Feb, 2008
5 commits
-
Just like with the user namespaces, move the namespace management code into
the separate .c file and mark the (already existing) PID_NS option as "depend
on NAMESPACES"[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Pavel Emelyanov
Acked-by: Serge Hallyn
Cc: Cedric Le Goater
Cc: "Eric W. Biederman"
Cc: Herbert Poetzl
Cc: Kirill Korotaev
Cc: Sukadev Bhattiprolu
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Make the user_namespace.o compilation depend on this option and move the
init_user_ns into user.c file to make the kernel compile and work without the
namespaces support. This make the user namespace code be organized similar to
other namespaces'.Also mask the USER_NS option as "depend on NAMESPACES".
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Pavel Emelyanov
Acked-by: Serge Hallyn
Cc: Cedric Le Goater
Cc: "Eric W. Biederman"
Cc: Herbert Poetzl
Cc: Kirill Korotaev
Cc: Sukadev Bhattiprolu
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Currently the IPC namespace management code is spread over the ipc/*.c files.
I moved this code into ipc/namespace.c file which is compiled out when needed.The linux/ipc_namespace.h file is used to store the prototypes of the
functions in namespace.c and the stubs for NAMESPACES=n case. This is done
so, because the stub for copy_ipc_namespace requires the knowledge of the
CLONE_NEWIPC flag, which is in sched.h. But the linux/ipc.h file itself in
included into many many .c files via the sys.h->sem.h sequence so adding the
sched.h into it will make all these .c depend on sched.h which is not that
good. On the other hand the knowledge about the namespaces stuff is required
in 4 .c files only.Besides, this patch compiles out some auxiliary functions from ipc/sem.c,
msg.c and shm.c files. It turned out that moving these functions into
namespaces.c is not that easy because they use many other calls and macros
from the original file. Moving them would make this patch complicated. On
the other hand all these functions can be consolidated, so I will send a
separate patch doing this a bit later.Signed-off-by: Pavel Emelyanov
Acked-by: Serge Hallyn
Cc: Cedric Le Goater
Cc: "Eric W. Biederman"
Cc: Herbert Poetzl
Cc: Kirill Korotaev
Cc: Sukadev Bhattiprolu
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Currently all the namespace management code is in the kernel/utsname.c file,
so just compile it out and make stubs in the appropriate header.The init namespace itself is in init/version.c and is in the kernel all the
time.Signed-off-by: Pavel Emelyanov
Acked-by: Serge Hallyn
Cc: Cedric Le Goater
Cc: "Eric W. Biederman"
Cc: Herbert Poetzl
Cc: Kirill Korotaev
Cc: Sukadev Bhattiprolu
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
The option is selectable if EMBEDDED is chosen only. When the EMBEDDED is off
namespaces will be on.Signed-off-by: Pavel Emelyanov
Acked-by: Serge Hallyn
Cc: Cedric Le Goater
Cc: "Eric W. Biederman"
Cc: Herbert Poetzl
Cc: Kirill Korotaev
Cc: Sukadev Bhattiprolu
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
08 Feb, 2008
2 commits
-
Setup the memory cgroup and add basic hooks and controls to integrate
and work with the cgroup.Signed-off-by: Balbir Singh
Cc: Pavel Emelianov
Cc: Paul Menage
Cc: Peter Zijlstra
Cc: "Eric W. Biederman"
Cc: Nick Piggin
Cc: Kirill Korotaev
Cc: Herbert Poetzl
Cc: David Rientjes
Cc: Vaidyanathan Srinivasan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
With fixes from David Rientjes
Introduce generic structures and routines for resource accounting.
Each resource accounting cgroup is supposed to aggregate it,
cgroup_subsystem_state and its resource-specific members within.Signed-off-by: Pavel Emelianov
Signed-off-by: Balbir Singh
Cc: Paul Menage
Cc: Peter Zijlstra
Cc: "Eric W. Biederman"
Cc: Nick Piggin
Cc: Kirill Korotaev
Cc: Herbert Poetzl
Cc: Vaidyanathan Srinivasan
Signed-off-by: David Rientjes
Cc: Pavel Emelianov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
07 Feb, 2008
1 commit
-
based on similar patch from: Pavel Machek
Introduce CONFIG_COMPAT_BRK. If disabled then the kernel is free
(but not obliged to) randomize the brk area.Heap randomization breaks ancient binaries, so we keep COMPAT_BRK
enabled by default.Signed-off-by: Ingo Molnar
06 Feb, 2008
3 commits
-
Signed-off-by: Matt Mackall
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Make /proc/ page monitoring configurable
This puts the following files under an embedded config option:
/proc/pid/clear_refs
/proc/pid/smaps
/proc/pid/pagemap
/proc/kpagecount
/proc/kpageflags[akpm@linux-foundation.org: Kconfig fix]
Signed-off-by: Matt Mackall
Cc: Dave Hansen
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Remove the broken status to CONFIG_TIMERFD.
Signed-off-by: Davide Libenzi
Cc: Michael Kerrisk
Cc: Thomas Gleixner
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
03 Feb, 2008
2 commits
-
Move the instrumentation Kconfig to
arch/Kconfig for architecture dependent options
- oprofile
- kprobesand
init/Kconfig for architecture independent options
- profiling
- markersRemove the "Instrumentation Support" menu. Everything moves to "General setup".
Delete the kernel/Kconfig.instrumentation file.Signed-off-by: Mathieu Desnoyers
Cc: Linus Torvalds
Cc:
Signed-off-by: Sam Ravnborg -
Puts the content of arch/Kconfig in the "General setup" menu.
Linus:
> Should it come with a re-duplication of it's content into each
> architecture, which was the case previously ? The oprofile and kprobes
> menu entries were litteraly cut and pasted from one architecture to
> another. Should we put its content in init/Kconfig then ?I don't think it's a good idea to go back to making it per-architecture,
although that extensive "depends on " might
indicate that there certainly is room for cleanup there.And I don't think it's wrong keeping it in kernel/Kconfig.xyz per se, I
just think it's wrong to (a) lump the code together when it really doesn't
necessarily need to and (b) show it to users as some kind of choice that
is tied together (whether it then has common code or not).On the per-architecture side, I do think it would be better to *not* have
internal architecture knowledge in a generic file, and as such a line likedepends on X86_32 || IA64 || PPC || S390 || SPARC64 || X86_64 || AVR32
really shouldn't exist in a file like kernel/Kconfig.instrumentation.
It would be much better to do
depends on ARCH_SUPPORTS_KPROBES
in that generic file, and then architectures that do support it would just
have abool ARCH_SUPPORTS_KPROBES
default yin *their* architecture files. That would seem to be much more logical,
and is readable both for arch maintainers *and* for people who have no
clue - and don't care - about which architecture is supposed to support
which interface...Sam Ravnborg:
Stuff it into a new file: arch/Kconfig
We can then extend this file to include all the 'trailing'
Kconfig things that are anyway equal for all ARCHs.But it should be kept clean - so if we introduce such a file
then we should use ARCH_HAS_whatever in the arch specific Kconfig
files to enable stuff that is not shared.[...]
The above suggestion is actually not exactly the best way to do it...
First the naming..
A quick grep shows following usage today (in Kconfig files)
ARCH_HAS 51
ARCH_SUPPORTS 4
HAVE_ARCH 7ARCH_HAS is the clear winner.
In the common Kconfig file do:
config FOO
depends on ARCH_HAS_FOO
bool "bla bla"config ARCH_HAS_FOO
def_bool nIn the arch specific Kconfig file in a suitable place do:
config SUITABLE_OPTION
select ARCH_HAS_FOOThe naming of ARCH_HAS_ is fixed and shall be:
ARCH_HAS_Only a single line added pr. architecture.
And we will end up with a (maybe even commented) list of trivial selects.- Yet another update :
Moving to HAVE_* now.
Signed-off-by: Mathieu Desnoyers
Cc: Jeff Dike
Cc: David Howells
Cc: Ananth N Mavinakayanahalli
Signed-off-by: Sam Ravnborg
01 Feb, 2008
1 commit
-
This patch supplies help text for the "RCU implementation type"
kernel configuration choice.Reported-by: Stefan Richter
Signed-off-by: Paul E. McKenney
Signed-off-by: Ingo Molnar
29 Jan, 2008
1 commit
-
Use the environment option to provide the ARCH symbol
and the KERNELVERSION symbol.Signed-off-by: Roman Zippel
Signed-off-by: Sam Ravnborg
28 Jan, 2008
1 commit
-
Support syscall auditing..
Signed-off-by: Yuichi Nakamura
Signed-off-by: Paul Mundt
26 Jan, 2008
1 commit
-
This patch implements a new version of RCU which allows its read-side
critical sections to be preempted. It uses a set of counter pairs
to keep track of the read-side critical sections and flips them
when all tasks exit read-side critical section. The details
of this implementation can be found in this paper -http://www.rdrop.com/users/paulmck/RCU/OLSrtRCU.2006.08.11a.pdf
and the article-
http://lwn.net/Articles/253651/
This patch was developed as a part of the -rt kernel development and
meant to provide better latencies when read-side critical sections of
RCU don't disable preemption. As a consequence of keeping track of RCU
readers, the readers have a slight overhead (optimizations in the paper).
This implementation co-exists with the "classic" RCU implementations
and can be switched to at compiler.Also includes RCU tracing summarized in debugfs.
[ akpm@linux-foundation.org: build fixes on non-preempt architectures ]
Signed-off-by: Gautham R Shenoy
Signed-off-by: Dipankar Sarma
Signed-off-by: Paul E. McKenney
Reviewed-by: Steven Rostedt
Signed-off-by: Ingo Molnar
25 Jan, 2008
1 commit
-
Make SYSFS_DEPRECATED depend on SYSFS since files that check
CONFIG_SYSFS_DEPRECATED don't check for CONFIG_SYSFS first.
Also don't prompt user about SYSFS_DEPRECATED if SYSFS=n.Signed-off-by: Randy Dunlap
Signed-off-by: Greg Kroah-Hartman
03 Jan, 2008
1 commit
-
Both SLUB and SLAB really did almost exactly the same thing for
/proc/slabinfo setup, using duplicate code and per-allocator #ifdef's.This just creates a common CONFIG_SLABINFO that is enabled by both SLUB
and SLAB, and shares all the setup code. Maybe SLOB will want this some
day too.Reviewed-by: Pekka Enberg
Signed-off-by: Linus Torvalds
03 Dec, 2007
1 commit
-
Commit cfb5285660aad4931b2ebbfa902ea48a37dfffa1 removed a useful feature for
us, which provided a cpu accounting resource controller. This feature would be
useful if someone wants to group tasks only for accounting purpose and doesnt
really want to exercise any control over their cpu consumption.The patch below reintroduces the feature. It is based on Paul Menage's
original patch (Commit 62d0df64065e7c135d0002f069444fbdfc64768f), with
these differences:- Removed load average information. I felt it needs more thought (esp
to deal with SMP and virtualized platforms) and can be added for
2.6.25 after more discussions.
- Convert group cpu usage to be nanosecond accurate (as rest of the cfs
stats are) and invoke cpuacct_charge() from the respective scheduler
classes
- Make accounting scalable on SMP systems by splitting the usage
counter to be per-cpu
- Move the code from kernel/cpu_acct.c to kernel/sched.c (since the
code is not big enough to warrant a new file and also this rightly
needs to live inside the scheduler. Also things like accessing
rq->lock while reading cpu usage becomes easier if the code lived in
kernel/sched.c)The patch also modifies the cpu controller not to provide the same accounting
information.Tested-by: Balbir Singh
Tested the patches on top of 2.6.24-rc3. The patches work fine. Ran
some simple tests like cpuspin (spin on the cpu), ran several tasks in
the same group and timed them. Compared their time stamps with
cpuacct.usage.Signed-off-by: Srivatsa Vaddagiri
Signed-off-by: Balbir Singh
Signed-off-by: Ingo Molnar
23 Nov, 2007
1 commit
-
Signed-off-by: Mike Frysinger
Signed-off-by: Bryan Wu
15 Nov, 2007
2 commits
-
This is my trivial patch to swat innumerable little bugs with a single
blow.After some intensive review (my apologies for not having gotten to this
sooner) what we have looks like a good base to build on with the current
pid namespace code but it is not complete, and it is still much to simple
to find issues where the kernel does the wrong thing outside of the initial
pid namespace.Until the dust settles and we are certain we have the ABI and the
implementation is as correct as humanly possible let's keep process ID
namespaces behind CONFIG_EXPERIMENTAL.Allowing us the option of fixing any ABI or other bugs we find as long as
they are minor.Allowing users of the kernel to avoid those bugs simply by ensuring their
kernel does not have support for multiple pid namespaces.[akpm@linux-foundation.org: coding-style cleanups]
Signed-off-by: Eric W. Biederman
Cc: Cedric Le Goater
Cc: Adrian Bunk
Cc: Jeremy Fitzhardinge
Cc: Kir Kolyshkin
Cc: Kirill Korotaev
Cc: Pavel Emelyanov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Revert 62d0df64065e7c135d0002f069444fbdfc64768f.
This was originally intended as a simple initial example of how to create a
control groups subsystem; it wasn't intended for mainline, but I didn't make
this clear enough to Andrew.The CFS cgroup subsystem now has better functionality for the per-cgroup usage
accounting (based directly on CFS stats) than the "usage" status file in this
patch, and the "load" status file is rather simplistic - although having a
per-cgroup load average report would be a useful feature, I don't believe this
patch actually provides it. If it gets into the final 2.6.24 we'd probably
have to support this interface for ever.Cc: Paul Menage
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
25 Oct, 2007
1 commit
-
mark CONFIG_FAIR_GROUP_SCHED as !EXPERIMENTAL. All bugs have been
fixed and it's perfect ;-)Signed-off-by: Ingo Molnar
21 Oct, 2007
1 commit
-
New kind of audit rule predicates: "object is visible in given subtree".
The part that can be sanely implemented, that is. Limitations:
* if you have hardlink from outside of tree, you'd better watch
it too (or just watch the object itself, obviously)
* if you mount something under a watched tree, tell audit
that new chunk should be added to watched subtrees
* if you umount something in a watched tree and it's still mounted
elsewhere, you will get matches on events happening there. New command
tells audit to recalculate the trees, trimming such sources of false
positives.Note that it's _not_ about path - if something mounted in several places
(multiple mount, bindings, different namespaces, etc.), the match does
_not_ depend on which one we are using for access.Signed-off-by: Al Viro
20 Oct, 2007
5 commits
-
Enable "cgroup" (formerly containers) based fair group scheduling. This
will let administrator create arbitrary groups of tasks (using "cgroup"
pseudo filesystem) and control their cpu bandwidth usage.[akpm@linux-foundation.org: fix cpp condition]
Signed-off-by: Srivatsa Vaddagiri
Signed-off-by: Dhaval Giani
Cc: Randy Dunlap
Cc: Balbir Singh
Cc: Paul Menage
Cc: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
When a task enters a new namespace via a clone() or unshare(), a new cgroup
is created and the task moves into it.This version names cgroups which are automatically created using
cgroup_clone() as "node_" where pid is the pid of the unsharing or
cloned process. (Thanks Pavel for the idea) This is safe because if the
process unshares again, it will create/cgroups/(...)/node_/node_
The only possibilities (AFAICT) for a -EEXIST on unshare are
1. pid wraparound
2. a process fails an unshare, then tries again.Case 1 is unlikely enough that I ignore it (at least for now). In case 2, the
node_ will be empty and can be rmdir'ed to make the subsequent unshare()
succeed.Changelog:
Name cloned cgroups as "node_".[clg@fr.ibm.com: fix order of cgroup subsystems in init/Kconfig]
Signed-off-by: Serge E. Hallyn
Cc: Paul Menage
Signed-off-by: Cedric Le Goater
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
This example subsystem exports debugging information as an aid to diagnosing
refcount leaks, etc, in the cgroup framework.Signed-off-by: Paul Menage
Cc: Serge E. Hallyn
Cc: "Eric W. Biederman"
Cc: Dave Hansen
Cc: Balbir Singh
Cc: Paul Jackson
Cc: Kirill Korotaev
Cc: Herbert Poetzl
Cc: Srivatsa Vaddagiri
Cc: Cedric Le Goater
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
This example demonstrates how to use the generic cgroup subsystem for a
simple resource tracker that counts, for the processes in a cgroup, the
total CPU time used and the %CPU used in the last complete 10 second interval.Portions contributed by Balbir Singh
Signed-off-by: Paul Menage
Cc: Serge E. Hallyn
Cc: "Eric W. Biederman"
Cc: Dave Hansen
Cc: Balbir Singh
Cc: Paul Jackson
Cc: Kirill Korotaev
Cc: Herbert Poetzl
Cc: Srivatsa Vaddagiri
Cc: Cedric Le Goater
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Remove the filesystem support logic from the cpusets system and makes cpusets
a cgroup subsystemThe "cpuset" filesystem becomes a dummy filesystem; attempts to mount it get
passed through to the cgroup filesystem with the appropriate options to
emulate the old cpuset filesystem behaviour.Signed-off-by: Paul Menage
Cc: Serge E. Hallyn
Cc: "Eric W. Biederman"
Cc: Dave Hansen
Cc: Balbir Singh
Cc: Paul Jackson
Cc: Kirill Korotaev
Cc: Herbert Poetzl
Cc: Srivatsa Vaddagiri
Cc: Cedric Le Goater
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds