Eric Lee / smarc-fsl-linux-kernel

17 Oct, 2007

5 commits

e98c32029 Move PREEMPT_NOTIFIERS into an always-included Kconfig ... Browse Code »

Kconfig.preempt is not included on some archs (for example, m68k). On those
archs, the Kconfig machinery complains that KVM selects an undefined symbol
PREEMPT_NOTIFIERS (which lives in Kconfig.preempt).

So move the offending symbol into a Kconfig file which is included by
everyone.

Cc: Roman Zippel
Cc: Geert Uytterhoeven
Signed-off-by: Avi Kivity
Cc: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Avi Kivity
2007-10-17 23:42:55 +0800
821f3eff7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild ... Browse Code »

* git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild: (40 commits)
kbuild: introduce ccflags-y, asflags-y and ldflags-y
kbuild: enable 'make CPPFLAGS=...' to add additional options to CPP
kbuild: enable use of AFLAGS and CFLAGS on commandline
kbuild: enable 'make AFLAGS=...' to add additional options to AS
kbuild: fix AFLAGS use in h8300 and m68knommu
kbuild: check for wrong use of CFLAGS
kbuild: enable 'make CFLAGS=...' to add additional options to CC
kbuild: fix up CFLAGS usage
kbuild: make modpost detect unterminated device id lists
kbuild: call export_report from the Makefile
kbuild: move Kai Germaschewski to CREDITS
kconfig/menuconfig: distinguish between selected-by-another options and comments
kconfig: tristate choices with mixed tristate and boolean values
include/linux/Kbuild: remove duplicate entries
kbuild: kill backward compatibility checks
kbuild: kill EXTRA_ARFLAGS
kbuild: fix documentation in makefiles.txt
kbuild: call make once for all targets when O=.. is used
kbuild: pass -g to assembler under CONFIG_DEBUG_INFO
kbuild: update _shipped files for kconfig syntax cleanup
...

Fix up conflicts in arch/um/sys-{x86_64,i386}/Makefile manually.

Linus Torvalds
2007-10-17 02:23:06 +0800
ac0e5b7a6 remove PAGE_GROUP_BY_MOBILITY ... Browse Code »

Grouping pages by mobility can be disabled at compile-time. This was
considered undesirable by a number of people. However, in the current stack of
patches, it is not a simple case of just dropping the configurable patch as it
would cause merge conflicts. This patch backs out the configuration option.

Signed-off-by: Mel Gorman
Acked-by: Andy Whitcroft
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Mel Gorman
2007-10-17 00:43:00 +0800
b92a6edd4 Add a configure option to group pages by mobility ... Browse Code »

The grouping mechanism has some memory overhead and a more complex allocation
path. This patch allows the strategy to be disabled for small memory systems
or if it is known the workload is suffering because of the strategy. It also
acts to show where the page groupings strategy interacts with the standard
buddy allocator.

Signed-off-by: Mel Gorman
Signed-off-by: Joel Schopp
Cc: Andy Whitcroft
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Mel Gorman
2007-10-17 00:42:59 +0800
bfe8df3d3 slow down printk during boot ... Browse Code »

Optionally add a boot delay after each kernel printk() call, crudely
measured in milliseconds, with a maximum delay of 10 seconds per printk.

Enable CONFIG_BOOT_PRINTK_DELAY=y and then add (e.g.):
"lpj=loops_per_jiffy boot_delay=100"
to the kernel command line.

It has been useful in cases like "during boot, my machine just reboots or the
screen goes black" by slowing down printk, (and adding initcall_debug), we can
usually see the last thing that happened before the lights went out which is
usually a valuable clue.

[akpm@linux-foundation.org: not all architectures implement CONFIG_HZ]
[akpm@linux-foundation.org: fix lots of stuff]
[bunk@stusta.de: kernel/printk.c: make 2 variables static]
[heiko.carstens@de.ibm.com: fix slow down printk on boot compile error]
Signed-off-by: Randy Dunlap
Signed-off-by: Dave Jones
Signed-off-by: Adrian Bunk
Signed-off-by: Heiko Carstens
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Randy Dunlap
2007-10-17 00:42:49 +0800

15 Oct, 2007

7 commits

fb615581c sched: group scheduler, fix coding style issues ... Browse Code »

Fix coding style issues reported by Randy Dunlap and others

Signed-off-by: Dhaval Giani
Signed-off-by: Srivatsa Vaddagiri
Signed-off-by: Ingo Molnar
Reviewed-by: Thomas Gleixner

Srivatsa Vaddagiri
2007-10-15 23:00:12 +0800
de8d585a1 sched: enable CONFIG_FAIR_GROUP_SCHED=y by default ... Browse Code »

enable CONFIG_FAIR_GROUP_SCHED=y by default.

Signed-off-by: Ingo Molnar
Signed-off-by: Peter Zijlstra
Reviewed-by: Thomas Gleixner

Ingo Molnar
2007-10-15 23:00:09 +0800
7ed2be459 sched: fair-group sched, cleanups ... Browse Code »

fair-group sched, cleanups.

Signed-off-by: Ingo Molnar
Signed-off-by: Peter Zijlstra
Reviewed-by: Thomas Gleixner

Ingo Molnar
2007-10-15 23:00:09 +0800
24e377a83 sched: add fair-user scheduler ... Browse Code »

Enable user-id based fair group scheduling. This is useful for anyone
who wants to test the group scheduler w/o having to enable
CONFIG_CGROUPS.

A separate scheduling group (i.e struct task_grp) is automatically created for
every new user added to the system. Upon uid change for a task, it is made to
move to the corresponding scheduling group.

A /proc tunable (/proc/root_user_share) is also provided to tune root
user's quota of cpu bandwidth.

Signed-off-by: Srivatsa Vaddagiri
Signed-off-by: Dhaval Giani
Signed-off-by: Ingo Molnar
Signed-off-by: Peter Zijlstra
Reviewed-by: Thomas Gleixner

Srivatsa Vaddagiri
2007-10-15 23:00:09 +0800
9b5b77512 sched: clean up code under CONFIG_FAIR_GROUP_SCHED ... Browse Code »

With the view of supporting user-id based fair scheduling (and not just
container-based fair scheduling), this patch renames several functions
and makes them independent of whether they are being used for container
or user-id based fair scheduling.

Also fix a problem reported by KAMEZAWA Hiroyuki (wrt allocating
less-sized array for tg->cfs_rq[] and tf->se[]).

Signed-off-by: Srivatsa Vaddagiri
Signed-off-by: Dhaval Giani
Signed-off-by: Ingo Molnar
Signed-off-by: Peter Zijlstra
Reviewed-by: Thomas Gleixner

Srivatsa Vaddagiri
2007-10-15 23:00:09 +0800
29f59db3a sched: group-scheduler core ... Browse Code »

Add interface to control cpu bandwidth allocation to task-groups.

(not yet configurable, due to missing CONFIG_CONTAINERS)

Signed-off-by: Srivatsa Vaddagiri
Signed-off-by: Dhaval Giani
Signed-off-by: Ingo Molnar
Signed-off-by: Peter Zijlstra

Srivatsa Vaddagiri
2007-10-15 23:00:07 +0800
a0f97e06a kbuild: enable 'make CFLAGS=...' to add additional options to CC ... Browse Code »

The variable CFLAGS is a wellknown variable and the usage by
kbuild may result in unexpected behaviour.
On top of that several people over time has asked for a way to
pass in additional flags to gcc.

This patch replace use of CFLAGS with KBUILD_CFLAGS all over the
tree and enabling one to use:
make CFLAGS=...
to specify additional gcc commandline options.

One usecase is when trying to find gcc bugs but other
use cases has been requested too.

Patch was tested on following architectures:
alpha, arm, i386, x86_64, mips, sparc, sparc64, ia64, m68k

Test was simple to do a defconfig build, apply the patch and check
that nothing got rebuild.

Signed-off-by: Sam Ravnborg

Sam Ravnborg
2007-10-15 04:21:35 +0800

20 Sep, 2007

2 commits

e42601973 disable sys_timerfd() for 2.6.23 ... Browse Code »

There is still some confusion and disagreement over what this interface should
actually do. So it is best that we disable it in 2.6.23 until we get that
fully sorted out.

(sys_timerfd() was present in 2.6.22 but it was apparently broken, so here we
assume that nobody is using it yet).

Cc: Michael Kerrisk
Cc: Davide Libenzi
Acked-by: Linus Torvalds
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Andrew Morton
2007-09-20 02:24:18 +0800
019ad4a0a Fix failure to resume from initrds ... Browse Code »

Commit 831441862956fffa17b9801db37e6ea1650b0f69 (Freezer: make kernel
threads nonfreezable by default) breaks freezing when attempting to resume
from an initrd, because the init (which is freezeable) spins while waiting
for another thread to run /linuxrc, but doesn't check whether it has been
told to enter the refrigerator. The original patch replaced a call to
try_to_freeze() with a call to yield(). I believe a simple reversion is
wrong because if !CONFIG_PM_SLEEP, try_to_freeze() is a noop. It should
still yield.

Signed-off-by: Nigel Cunningham
Acked-by: Rafael J. Wysocki
Acked-by: Pavel Machek
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Nigel Cunningham
2007-09-20 02:24:17 +0800

31 Aug, 2007

1 commit

62e6f1e8b fix maxcpus=1 oops in show_stat() ... Browse Code »

Alexey Dobriyan reports that maxcpus=1 is still broken in 2.6.23-rc4:
if CONFIG_HOTPLUG_CPU is not set, x86_64 bootup oopses in show_stat() -
for_each_possible_cpu accesses a per-cpu area which was never set up.

Alexey identified commit 61ec7567db103d537329b0db9a887db570431ff4
(ACPI: boot correctly with "nosmp" or "maxcpus=0") as the origin;
but it's not really to blame, just exposes a bug in 2.6.23-rc1's commit
8b3b295502444340dd0701855ac422fbf32e161d (Especially when !CONFIG_HOTPLUG_CPU,
avoid needlessy allocating resources for CPUs that can never become available).

rc1's test for max_cpus < 2 in start_kernel() wasn't working because
max_cpus was still NR_CPUS at that point: until rc4 moved the maxcpus
parsing earlier. Now it sets cpu_possible_map to 1 before allocating
all possible per-cpu areas; then smp_init() expands cpu_possible_map
to cpu_present_map (0xf in my case) later on.

rc1's commit has good intentions, but expects cpu_present_map to be
limited by maxcpus, which is only the case on i386. cpus_and(possible,
possible,present) might be good, but needs an audit of cpu_present_map
uses - there may well be assumptions that any cpu present is possible.

So stay safe for now and just revert those #ifndef CONFIG_HOTPLUG_CPU
optimizations in rc1's commit.

Signed-off-by: Hugh Dickins
Cc: Alexey Dobriyan
Cc: Len Brown
Cc: Andrew Morton
Cc: Jan Beulich
Signed-off-by: Linus Torvalds

Hugh Dickins
2007-08-31 12:54:31 +0800

28 Aug, 2007

1 commit

813409771 fix maxcpus=N parsing ... Browse Code »

Commit 61ec7567db103d537329b0db9a887db570431ff4 ('ACPI: boot correctly
with "nosmp" or "maxcpus=0"') broke 'maxcpus=' handling on x86[-64].

maxcpus=N is now having no effect on x86_64, and freezing bootup on i386
(because of inconsistency with the separate maxcpus parsing down in
arch/i386, I guess). That's because early_param parsing is a little
different from __setup parsing, and needs the "=" omitted: then it seems
to work as the original commit intended (no mention of IO-APIC in
/proc/interrupts when maxcpus=0).

Signed-off-by: Hugh Dickins
Cc: Andrew Morton
Cc: Len Brown
Cc: Andi Kleen
Cc: Rusty Russell
Signed-off-by: Linus Torvalds

Hugh Dickins
2007-08-28 01:27:48 +0800

21 Aug, 2007

1 commit

61ec7567d ACPI: boot correctly with "nosmp" or "maxcpus=0" ... Browse Code »

In MPS mode, "nosmp" and "maxcpus=0" boot a UP kernel with IOAPIC disabled.
However, in ACPI mode, these parameters didn't completely disable
the IO APIC initialization code and boot failed.

init/main.c:
Disable the IO_APIC if "nosmp" or "maxcpus=0"
undefine disable_ioapic_setup() when it doesn't apply.

i386:
delete ioapic_setup(), it was a duplicate of parse_noapic()
delete undefinition of disable_ioapic_setup()

x86_64:
rename disable_ioapic_setup() to parse_noapic() to match i386
define disable_ioapic_setup() in header to match i386

http://bugzilla.kernel.org/show_bug.cgi?id=1641

Acked-by: Andi Kleen
Signed-off-by: Len Brown

Len Brown
2007-08-21 12:33:35 +0800

01 Aug, 2007

2 commits

ff0cfc66c Kconfig: remove top level menu "Code maturity level options" ... Browse Code »

Remove the top level menu "Code maturity level options", and moves its
options into menu "General setup".

This makes Kconfig less cluttered and easier to setup.

Signed-off-by: Al Boldi
Acked-by: Sam Ravnborg
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Al Boldi
2007-08-01 06:39:43 +0800
448e3cee8 ANON_INODES shouldn't be user visible ... Browse Code »

There doesn't seem to be a good reason for ANON_INODES being
an user visible option.

Signed-off-by: Adrian Bunk
Acked-by: Davide Libenzi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Adrian Bunk
2007-08-01 06:39:42 +0800

31 Jul, 2007

1 commit

6a302358d Merge master.kernel.org:/pub/scm/linux/kernel/git/lethal/sh-2.6.23 ... Browse Code »

* master.kernel.org:/pub/scm/linux/kernel/git/lethal/sh-2.6.23:
sh: Fix fs.h removal from mm.h regressions.
sh: fix get_wchan() for SH kernels without framepointers
sh: arch/sh/boot - fix shell usage
rtc: rtc-sh: Correct sh_rtc_set_time() for some SH-3 parts.
sh: remove support for sh7300 and solution engine 7300
sh: Add sh to the CC_OPTIMIZE_FOR_SIZE dependencies.
sh: Kill off virt_to_bus()/bus_to_virt().
sh: sh-sci - fix SH7708 support
sh: Restrict DSP support to specific CPUs.
sh: Silence sq compile warning on sh4 nommu.
sh: Kill the rest of the SE73180 cruft.
sh: remove support for sh73180 and solution engine 73180
sh: remove old broken pint code
sh: Reclaim beginning of P3 space for vmalloc area.
sh: Fix Dreamcast DMA issues.
sh: Add kmap_coherent()/kunmap_coherent() interface for SH-4.

Linus Torvalds
2007-07-31 12:54:37 +0800

27 Jul, 2007

1 commit

b0a5ab931 initramfs: missing __init ... Browse Code »

Signed-off-by: Al Viro
Signed-off-by: Linus Torvalds

Al Viro
2007-07-27 02:11:56 +0800

26 Jul, 2007

1 commit

32582fa46 sh: Add sh to the CC_OPTIMIZE_FOR_SIZE dependencies. ... Browse Code »

Presently we only use this with CONFIG_EXPERIMENTAL, but it is
something that can be supported commonly.

Signed-off-by: Paul Mundt

Paul Mundt
2007-07-26 14:37:44 +0800

18 Jul, 2007

2 commits

831441862 Freezer: make kernel threads nonfreezable by default ... Browse Code »

Currently, the freezer treats all tasks as freezable, except for the kernel
threads that explicitly set the PF_NOFREEZE flag for themselves. This
approach is problematic, since it requires every kernel thread to either
set PF_NOFREEZE explicitly, or call try_to_freeze(), even if it doesn't
care for the freezing of tasks at all.

It seems better to only require the kernel threads that want to or need to
be frozen to use some freezer-related code and to remove any
freezer-related code from the other (nonfreezable) kernel threads, which is
done in this patch.

The patch causes all kernel threads to be nonfreezable by default (ie. to
have PF_NOFREEZE set by default) and introduces the set_freezable()
function that should be called by the freezable kernel threads in order to
unset PF_NOFREEZE. It also makes all of the currently freezable kernel
threads call set_freezable(), so it shouldn't cause any (intentional)
change of behaviour to appear. Additionally, it updates documentation to
describe the freezing of tasks more accurately.

[akpm@linux-foundation.org: build fixes]
Signed-off-by: Rafael J. Wysocki
Acked-by: Nigel Cunningham
Cc: Pavel Machek
Cc: Oleg Nesterov
Cc: Gautham R Shenoy
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Rafael J. Wysocki
2007-07-18 01:23:02 +0800
a0acd8208 Make SLUB the default allocator ... Browse Code »

There are some reports that 2.6.22 has SLUB as the default. Not
true!

This will make SLUB the default for 2.6.23.

Signed-off-by: Christoph Lameter
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Christoph Lameter
2007-07-18 01:23:02 +0800

17 Jul, 2007

8 commits

8b3b29550 adjust nosmp handling ... Browse Code »

Especially when !CONFIG_HOTPLUG_CPU, avoid needlessy allocating resources for
CPUs that can never become available.

Signed-off-by: Jan Beulich
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jan Beulich
2007-07-17 00:05:47 +0800
97842216b Allow softlockup to be runtime disabled ... Browse Code »

It's useful sometimes to disable the softlockup checker at boottime.
Especially if it triggers during a distro install.

Signed-off-by: Dave Jones
Cc: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Dave Jones
2007-07-17 00:05:47 +0800
acce292c8 user namespace: add the framework ... Browse Code »

Basically, it will allow a process to unshare its user_struct table,
resetting at the same time its own user_struct and all the associated
accounting.

A new root user (uid == 0) is added to the user namespace upon creation.
Such root users have full privileges and it seems that theses privileges
should be controlled through some means (process capabilities ?)

The unshare is not included in this patch.

Changes since [try #4]:
- Updated get_user_ns and put_user_ns to accept NULL, and
get_user_ns to return the namespace.

Changes since [try #3]:
- moved struct user_namespace to files user_namespace.{c,h}

Changes since [try #2]:
- removed struct user_namespace* argument from find_user()

Changes since [try #1]:
- removed struct user_namespace* argument from find_user()
- added a root_user per user namespace

Signed-off-by: Cedric Le Goater
Signed-off-by: Serge E. Hallyn
Acked-by: Pavel Emelianov
Cc: Herbert Poetzl
Cc: Kirill Korotaev
Cc: Eric W. Biederman
Cc: Chris Wright
Cc: Stephen Smalley
Cc: James Morris
Cc: Andrew Morgan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Cedric Le Goater
2007-07-17 00:05:47 +0800
7d69a1f4a remove CONFIG_UTS_NS and CONFIG_IPC_NS ... Browse Code »

CONFIG_UTS_NS and CONFIG_IPC_NS have very little value as they only
deactivate the unshare of the uts and ipc namespaces and do not improve
performance.

Signed-off-by: Cedric Le Goater
Acked-by: "Serge E. Hallyn"
Cc: Eric W. Biederman
Cc: Herbert Poetzl
Cc: Pavel Emelianov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Cedric Le Goater
2007-07-17 00:05:47 +0800
cc1ed7542 init: wait for asynchronously scanned block devices ... Browse Code »

Some buses (e.g. USB and MMC) do their scanning of devices in the
background, causing a race between them and prepare_namespace(). In order
to be able to use these buses without an initrd, we now wait for the device
specified in root= to actually show up.

If the device never shows up than we will hang in an infinite loop. In
order to not mess with setups that reboot on panic, the feature must be
turned on via the command line option "rootwait".

[bunk@stusta.de: root_wait can become static]
Signed-off-by: Pierre Ossman
Cc: Al Viro
Cc: Christoph Hellwig
Signed-off-by: Adrian Bunk
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Pierre Ossman
2007-07-17 00:05:45 +0800
66da57332 Use menuconfig objects II - module menu ... Browse Code »

Change menuconfig objects from "menu, config" into "menuconfig" so that the
user can disable the whole feature without entering its menu first.

Signed-off-by: Jan Engelhardt
Cc: Rusty Russell
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jan Engelhardt
2007-07-17 00:05:40 +0800
84a01c2f8 slob: sparsemem support ... Browse Code »

Currently slob is disabled if we're using sparsemem, due to an earlier
patch from Goto-san. Slob and static sparsemem work without any trouble as
it is, and the only hiccup is a missing slab_is_available() in the case of
sparsemem extreme. With this, we're rid of the last set of restrictions
for slob usage.

Signed-off-by: Paul Mundt
Acked-by: Pekka Enberg
Acked-by: Matt Mackall
Cc: Christoph Lameter
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Paul Mundt
2007-07-17 00:05:36 +0800
18a8bd949 serial: convert early_uart to earlycon for 8250 ... Browse Code »

Beacuse SERIAL_PORT_DFNS is removed from include/asm-i386/serial.h and
include/asm-x86_64/serial.h. the serial8250_ports need to be probed late in
serial initializing stage. the console_init=>serial8250_console_init=>
register_console=>serial8250_console_setup will return -ENDEV, and console
ttyS0 can not be enabled at that time. need to wait till uart_add_one_port in
drivers/serial/serial_core.c to call register_console to get console ttyS0.
that is too late.

Make early_uart to use early_param, so uart console can be used earlier. Make
it to be bootconsole with CON_BOOT flag, so can use console handover feature.
and it will switch to corresponding normal serial console automatically.

new command line will be:
console=uart8250,io,0x3f8,9600n8
console=uart8250,mmio,0xff5e0000,115200n8
or
earlycon=uart8250,io,0x3f8,9600n8
earlycon=uart8250,mmio,0xff5e0000,115200n8

it will print in very early stage:
Early serial console at I/O port 0x3f8 (options '9600n8')
console [uart0] enabled
later for console it will print:
console handover: boot [uart0] -> real [ttyS0]

Signed-off-by:
Cc: Andi Kleen
Cc: Bjorn Helgaas
Cc: Russell King
Cc: Gerd Hoffmann
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Yinghai Lu
2007-07-17 00:05:35 +0800

10 Jul, 2007

2 commits

5f5c926e3 block/Kconfig already has its own "menuconfig" so remove these ... Browse Code »

"menu, endmenu" that did not get cleaned up in the block patch
[ http://lkml.org/lkml/2007/4/10/251 ]

Signed-off-by: Jan Engelhardt
Signed-off-by: Andrew Morton
Signed-off-by: Jens Axboe

Jan Engelhardt
2007-07-10 19:43:28 +0800
1df21055e sched: add init_idle_bootup_task() ... Browse Code »

add the init_idle_bootup_task() callback to the bootup thread,
unused at the moment. (CFS will use it to switch the scheduling
class of the boot thread to the idle class)

Signed-off-by: Ingo Molnar

Ingo Molnar
2007-07-10 00:51:58 +0800

19 May, 2007

1 commit

92080309d init/main: use __init_refok to fix section mismatch ... Browse Code »

Kill a special case in modpost by introducing the
__init_refok marker.

Signed-off-by: Sam Ravnborg

Sam Ravnborg
2007-05-19 15:11:58 +0800

17 May, 2007

2 commits

9fbf09a09 SLUB: Remove depends on EXPERIMENTAL and !ARCH_USES_SLAB_PAGE_STRUCT ... Browse Code »

No arch sets ARCH_USES_SLAB_PAGE_STRUCT anymore.

Remove the experimental dependency as well since we want to have it as
a real alternative to SLAB.

It all comes down to killing a single line from init/Kconfig.

Signed-off-by: Christoph Lameter
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Christoph Lameter
2007-05-17 20:23:03 +0800
afc0cedbe slob: implement RCU freeing ... Browse Code »

The SLOB allocator should implement SLAB_DESTROY_BY_RCU correctly, because
even on UP, RCU freeing semantics are not equivalent to simply freeing
immediately. This also allows SLOB to be used on SMP.

Signed-off-by: Nick Piggin
Acked-by: Matt Mackall
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Nick Piggin
2007-05-17 20:23:02 +0800

11 May, 2007

3 commits

e1ad7468c signal/timer/event: eventfd core ... Browse Code »

This is a very simple and light file descriptor, that can be used as event
wait/dispatch by userspace (both wait and dispatch) and by the kernel
(dispatch only). It can be used instead of pipe(2) in all cases where those
would simply be used to signal events. Their kernel overhead is much lower
than pipes, and they do not consume two fds. When used in the kernel, it can
offer an fd-bridge to enable, for example, functionalities like KAIO or
syslets/threadlets to signal to an fd the completion of certain operations.
But more in general, an eventfd can be used by the kernel to signal readiness,
in a POSIX poll/select way, of interfaces that would otherwise be incompatible
with it. The API is:

int eventfd(unsigned int count);

The eventfd API accepts an initial "count" parameter, and returns an eventfd
fd. It supports poll(2) (POLLIN, POLLOUT, POLLERR), read(2) and write(2).

The POLLIN flag is raised when the internal counter is greater than zero.

The POLLOUT flag is raised when at least a value of "1" can be written to the
internal counter.

The POLLERR flag is raised when an overflow in the counter value is detected.

The write(2) operation can never overflow the counter, since it blocks (unless
O_NONBLOCK is set, in which case -EAGAIN is returned).

But the eventfd_signal() function can do it, since it's supposed to not sleep
during its operation.

The read(2) function reads the __u64 counter value, and reset the internal
value to zero. If the value read is equal to (__u64) -1, an overflow happened
on the internal counter (due to 2^64 eventfd_signal() posts that has never
been retired - unlickely, but possible).

The write(2) call writes an __u64 count value, and adds it to the current
counter. The eventfd fd supports O_NONBLOCK also.

On the kernel side, we have:

struct file *eventfd_fget(int fd);
int eventfd_signal(struct file *file, unsigned int n);

The eventfd_fget() should be called to get a struct file* from an eventfd fd
(this is an fget() + check of f_op being an eventfd fops pointer).

The kernel can then call eventfd_signal() every time it wants to post an event
to userspace. The eventfd_signal() function can be called from any context.
An eventfd() simple test and bench is available here:

http://www.xmailserver.org/eventfd-bench.c

This is the eventfd-based version of pipetest-4 (pipe(2) based):

http://www.xmailserver.org/pipetest-4.c

Not that performance matters much in the eventfd case, but eventfd-bench
shows almost as double as performance than pipetest-4.

[akpm@linux-foundation.org: fix i386 build]
[akpm@linux-foundation.org: add sys_eventfd to sys_ni.c]
Signed-off-by: Davide Libenzi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Davide Libenzi
2007-05-11 23:29:36 +0800
b215e2839 signal/timer/event: timerfd core ... Browse Code »

This patch introduces a new system call for timers events delivered though
file descriptors. This allows timer event to be used with standard POSIX
poll(2), select(2) and read(2). As a consequence of supporting the Linux
f_op->poll subsystem, they can be used with epoll(2) too.

The system call is defined as:

int timerfd(int ufd, int clockid, int flags, const struct itimerspec *utmr);

The "ufd" parameter allows for re-use (re-programming) of an existing timerfd
w/out going through the close/open cycle (same as signalfd). If "ufd" is -1,
s new file descriptor will be created, otherwise the existing "ufd" will be
re-programmed.

The "clockid" parameter is either CLOCK_MONOTONIC or CLOCK_REALTIME. The time
specified in the "utmr->it_value" parameter is the expiry time for the timer.

If the TFD_TIMER_ABSTIME flag is set in "flags", this is an absolute time,
otherwise it's a relative time.

If the time specified in the "utmr->it_interval" is not zero (.tv_sec == 0,
tv_nsec == 0), this is the period at which the following ticks should be
generated.

The "utmr->it_interval" should be set to zero if only one tick is requested.
Setting the "utmr->it_value" to zero will disable the timer, or will create a
timerfd without the timer enabled.

The function returns the new (or same, in case "ufd" is a valid timerfd
descriptor) file, or -1 in case of error.

As stated before, the timerfd file descriptor supports poll(2), select(2) and
epoll(2). When a timer event happened on the timerfd, a POLLIN mask will be
returned.

The read(2) call can be used, and it will return a u32 variable holding the
number of "ticks" that happened on the interface since the last call to
read(2). The read(2) call supportes the O_NONBLOCK flag too, and EAGAIN will
be returned if no ticks happened.

A quick test program, shows timerfd working correctly on my amd64 box:

http://www.xmailserver.org/timerfd-test.c

[akpm@linux-foundation.org: add sys_timerfd to sys_ni.c]
Signed-off-by: Davide Libenzi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Davide Libenzi
2007-05-11 23:29:36 +0800
fba2afaae signal/timer/event: signalfd core ... Browse Code »

This patch series implements the new signalfd() system call.

I took part of the original Linus code (and you know how badly it can be
broken :), and I added even more breakage ;) Signals are fetched from the same
signal queue used by the process, so signalfd will compete with standard
kernel delivery in dequeue_signal(). If you want to reliably fetch signals on
the signalfd file, you need to block them with sigprocmask(SIG_BLOCK). This
seems to be working fine on my Dual Opteron machine. I made a quick test
program for it:

http://www.xmailserver.org/signafd-test.c

The signalfd() system call implements signal delivery into a file descriptor
receiver. The signalfd file descriptor if created with the following API:

int signalfd(int ufd, const sigset_t *mask, size_t masksize);

The "ufd" parameter allows to change an existing signalfd sigmask, w/out going
to close/create cycle (Linus idea). Use "ufd" == -1 if you want a brand new
signalfd file.

The "mask" allows to specify the signal mask of signals that we are interested
in. The "masksize" parameter is the size of "mask".

The signalfd fd supports the poll(2) and read(2) system calls. The poll(2)
will return POLLIN when signals are available to be dequeued. As a direct
consequence of supporting the Linux poll subsystem, the signalfd fd can use
used together with epoll(2) too.

The read(2) system call will return a "struct signalfd_siginfo" structure in
the userspace supplied buffer. The return value is the number of bytes copied
in the supplied buffer, or -1 in case of error. The read(2) call can also
return 0, in case the sighand structure to which the signalfd was attached,
has been orphaned. The O_NONBLOCK flag is also supported, and read(2) will
return -EAGAIN in case no signal is available.

If the size of the buffer passed to read(2) is lower than sizeof(struct
signalfd_siginfo), -EINVAL is returned. A read from the signalfd can also
return -ERESTARTSYS in case a signal hits the process. The format of the
struct signalfd_siginfo is, and the valid fields depends of the (->code &
__SI_MASK) value, in the same way a struct siginfo would:

struct signalfd_siginfo {
__u32 signo; /* si_signo */
__s32 err; /* si_errno */
__s32 code; /* si_code */
__u32 pid; /* si_pid */
__u32 uid; /* si_uid */
__s32 fd; /* si_fd */
__u32 tid; /* si_fd */
__u32 band; /* si_band */
__u32 overrun; /* si_overrun */
__u32 trapno; /* si_trapno */
__s32 status; /* si_status */
__s32 svint; /* si_int */
__u64 svptr; /* si_ptr */
__u64 utime; /* si_utime */
__u64 stime; /* si_stime */
__u64 addr; /* si_addr */
};

[akpm@linux-foundation.org: fix signalfd_copyinfo() on i386]
Signed-off-by: Davide Libenzi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Davide Libenzi
2007-05-11 23:29:36 +0800