11 Oct, 2008
1 commit
-
Move cio's private simple udelay function to lib/delay.c and turn it
into something much more readable. So we have all implementations
at one place.Signed-off-by: Heiko Carstens
Signed-off-by: Martin Schwidefsky
04 Oct, 2008
1 commit
-
This fixes a regression that came with 934b2857cc576ae53c92a66e63fce7ddcfa74691
("[S390] nohz/sclp: disable timer on synchronous waits.").
If udelay() gets called from a disabled context it sets the clock comparator
to a value where it expects the next interrupt. When the interrupt happens
the clock comparator gets not reset and therefore the interrupt condition
doesn't get cleared. The result is an endless timer interrupt loop.In addition this patch fixes also the following:
rcutorture reveals that our __udelay implementation is still buggy,
since it might schedule tasklets, but prevents their execution:NOHZ: local_softirq_pending 42
NOHZ: local_softirq_pending 02
NOHZ: local_softirq_pending 142
NOHZ: local_softirq_pending 02To fix this we make sure that only the clock comparator interrupt
is enabled when the enabled wait psw is loaded.
Also no code gets called anymore which might schedule tasklets.Signed-off-by: Heiko Carstens
Signed-off-by: Martin Schwidefsky
01 Aug, 2008
1 commit
-
sclp_sync_wait wait synchronously for an sclp interrupt and disables
timer interrupts. However on the irq enter paths there is an extra
check if a timer interrupt would be due and calls the timer callback.
This would schedule softirqs in the wrong context.
So introduce local_tick_enable/disable which prevents this.Signed-off-by: Heiko Carstens
Signed-off-by: Martin Schwidefsky
30 Apr, 2008
2 commits
-
arch/s390/lib/uaccess_mvcos.c:166:
warning: 'strnlen_user_mvcos' defined but not used
arch/s390/lib/uaccess_mvcos.c:186:
warning: 'strncpy_from_user_mvcos' defined but not usedSigned-off-by: Heiko Carstens
Signed-off-by: Martin Schwidefsky -
Signed-off-by: Mathieu Desnoyers
CC: Sam Ravnborg
Signed-off-by: Heiko Carstens
Signed-off-by: Martin Schwidefsky
17 Apr, 2008
2 commits
-
The current uaccess page table walk code assumes at a few places that
any access is a user space access. This is not correct if somebody
has issued a set_fs(KERNEL_DS) in advance.
Add code which checks which address space we are in and with this make
sure we access the correct address space. This way we get also rid of
the dirty
if (!currrent-mm)
return -EFAULT;
hack in futex_atomic_cmpxchg_pt.Signed-off-by: Martin Schwidefsky
Signed-off-by: Heiko Carstens -
This way we get rid of s390's NO_IDLE_HZ and use the generic dynticks
variant instead. In addition we get high resolution timers for free.Signed-off-by: Martin Schwidefsky
Signed-off-by: Heiko Carstens
21 Mar, 2008
1 commit
-
a0c1e9073ef7428a14309cba010633a6cd6719ea "futex: runtime enable pi and
robust functionality" introduces a test wether futex in atomic stuff
works or not.
It does that by writing to address 0 of the kernel address space. This
will crash on older machines where addressing mode switching is enabled
but where the mvcos instruction is not available. Page table walking is
done by hand and therefore the code tries to access current->mm which
is NULL.
Therefore add an extra check, so we survive the early test.Signed-off-by: Heiko Carstens
Signed-off-by: Martin Schwidefsky
19 Feb, 2008
1 commit
-
Add missing exception table entry so that the kernel can handle
proctection exceptions as well on the cs instruction. Currently only
specification exceptions are handled correctly.
The missing entry allows user space to crash the kernel.Cc: stable
Signed-off-by: Heiko Carstens
Signed-off-by: Martin Schwidefsky
26 Jan, 2008
2 commits
-
In s390's spin_lock_irqsave, interrupts remain disabled while
spinning. In other architectures like x86 and powerpc, interrupts are
re-enabled while spinning if IRQ is not masked before spin_lock_irqsave
is called.The following patch re-enables interrupts through local_irq_restore
while spinning for a lock acquisition.
This can improve system response.[heiko.carstens@de.ibm.com: removed saving of pc]
Signed-off-by: Hisashi Hifumi
Signed-off-by: Heiko Carstens
Signed-off-by: Martin Schwidefsky -
Used to contain the address of the holder of the lock. But since the
spinlock code is not inlined anymore all locks contain the same address
anyway. And since in addtition nobody complained about that for ages
its obviously unused. So remove it.Signed-off-by: Heiko Carstens
Signed-off-by: Martin Schwidefsky
22 Oct, 2007
2 commits
-
Get independent from asm-generic/4level-fixup.h
Signed-off-by: Martin Schwidefsky
-
Define and use follow_table inline in uaccess_pt.c to simplify
the code.Signed-off-by: Martin Schwidefsky
20 Oct, 2007
1 commit
-
is_init() is an ambiguous name for the pid==1 check. Split it into
is_global_init() and is_container_init().A cgroup init has it's tsk->pid == 1.
A global init also has it's tsk->pid == 1 and it's active pid namespace
is the init_pid_ns. But rather than check the active pid namespace,
compare the task structure with 'init_pid_ns.child_reaper', which is
initialized during boot to the /sbin/init process and never changes.Changelog:
2.6.22-rc4-mm2-pidns1:
- Use 'init_pid_ns.child_reaper' to determine if a given task is the
global init (/sbin/init) process. This would improve performance
and remove dependence on the task_pid().2.6.21-mm2-pidns2:
- [Sukadev Bhattiprolu] Changed is_container_init() calls in {powerpc,
ppc,avr32}/traps.c for the _exception() call to is_global_init().
This way, we kill only the cgroup if the cgroup's init has a
bug rather than force a kernel panic.[akpm@linux-foundation.org: fix comment]
[sukadev@us.ibm.com: Use is_global_init() in arch/m32r/mm/fault.c]
[bunk@stusta.de: kernel/pid.c: remove unused exports]
[sukadev@us.ibm.com: Fix capability.c to work with threaded init]
Signed-off-by: Serge E. Hallyn
Signed-off-by: Sukadev Bhattiprolu
Acked-by: Pavel Emelianov
Cc: Eric W. Biederman
Cc: Cedric Le Goater
Cc: Dave Hansen
Cc: Herbert Poetzel
Cc: Kirill Korotaev
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
20 Jul, 2007
1 commit
-
This patch completes Linus's wish that the fault return codes be made into
bit flags, which I agree makes everything nicer. This requires requires
all handle_mm_fault callers to be modified (possibly the modifications
should go further and do things like fault accounting in handle_mm_fault --
however that would be for another patch).[akpm@linux-foundation.org: fix alpha build]
[akpm@linux-foundation.org: fix s390 build]
[akpm@linux-foundation.org: fix sparc build]
[akpm@linux-foundation.org: fix sparc64 build]
[akpm@linux-foundation.org: fix ia64 build]
Signed-off-by: Nick Piggin
Cc: Richard Henderson
Cc: Ivan Kokshaysky
Cc: Russell King
Cc: Ian Molton
Cc: Bryan Wu
Cc: Mikael Starvik
Cc: David Howells
Cc: Yoshinori Sato
Cc: "Luck, Tony"
Cc: Hirokazu Takata
Cc: Geert Uytterhoeven
Cc: Roman Zippel
Cc: Greg Ungerer
Cc: Matthew Wilcox
Cc: Paul Mackerras
Cc: Benjamin Herrenschmidt
Cc: Heiko Carstens
Cc: Martin Schwidefsky
Cc: Paul Mundt
Cc: Kazumoto Kojima
Cc: Richard Curnow
Cc: William Lee Irwin III
Cc: "David S. Miller"
Cc: Jeff Dike
Cc: Paolo 'Blaisorblade' Giarrusso
Cc: Miles Bader
Cc: Chris Zankel
Acked-by: Kyle McMartin
Acked-by: Haavard Skinnemoen
Acked-by: Ralf Baechle
Acked-by: Andi Kleen
Signed-off-by: Andrew Morton
[ Still apparently needs some ARM and PPC loving - Linus ]
Signed-off-by: Linus Torvalds
10 Jul, 2007
1 commit
-
The bogomips calculation triggered via reading from /proc/cpuinfo
can return incorrect values if the qrnnd assembly is called with a
pointer in %r2 with any of the upper 32 bits set.
Fix this by using 64 bit division / remainder operation provided by
gcc instead of calling the assembly.Signed-off-by: Martin Schwidefsky
26 Apr, 2007
1 commit
-
Allow s390 to properly override the generic
__div64_32() implementation by:1) Using obj-y for div64.o in s390's makefile instead
of lib-y2) Adding the weak attribute to the generic implementation.
Signed-off-by: David S. Miller
21 Feb, 2007
1 commit
-
The new delay implementation uses the clock comparator and an external
interrupt even if it is called disabled for interrupts. To do this
all external interrupt source except clock comparator are switched of
before enabling external interrupts. The external interrupt at the
end of the delay period may not execute softirqs or we can end up in a
dead-lock.Signed-off-by: Martin Schwidefsky
06 Feb, 2007
5 commits
-
Signed-off-by: Heiko Carstens
Signed-off-by: Martin Schwidefsky -
Preset the bogomips number to the cpu capacity value reported by
store system information in SYSIB 1.2.2. This value is constant
for a particular machine model and can be used to determine
relative performance differences between machines.Signed-off-by: Martin Schwidefsky
-
This patch adds support for clock synchronization to an external time
reference (ETR). The external time reference sends an oscillator
signal and a synchronization signal every 2^20 microseconds to keep
the TOD clocks of all connected servers in sync. For availability
two ETR units can be connected to a machine. If the clock deviates
for more than the sync-check tolerance all cpus get a machine check
that indicates that the clock is out of sync. For the lovely details
how to get the clock back in sync see the code below.Signed-off-by: Martin Schwidefsky
-
This provides a noexec protection on s390 hardware. Our hardware does
not have any bits left in the pte for a hw noexec bit, so this is a
different approach using shadow page tables and a special addressing
mode that allows separate address spaces for code and data.As a special feature of our "secondary-space" addressing mode, separate
page tables can be specified for the translation of data addresses
(storage operands) and instruction addresses. The shadow page table is
used for the instruction addresses and the standard page table for the
data addresses.
The shadow page table is linked to the standard page table by a pointer
in page->lru.next of the struct page corresponding to the page that
contains the standard page table (since page->private is not really
private with the pte_lock and the page table pages are not in the LRU
list).
Depending on the software bits of a pte, it is either inserted into
both page tables or just into the standard (data) page table. Pages of
a vma that does not have the VM_EXEC bit set get mapped only in the
data address space. Any try to execute code on such a page will cause a
page translation exception. The standard reaction to this is a SIGSEGV
with two exceptions: the two system call opcodes 0x0a77 (sys_sigreturn)
and 0x0aad (sys_rt_sigreturn) are allowed. They are stored by the
kernel to the signal stack frame. Unfortunately, the signal return
mechanism cannot be modified to use an SA_RESTORER because the
exception unwinding code depends on the system call opcode stored
behind the signal stack frame.This feature requires that user space is executed in secondary-space
mode and the kernel in home-space mode, which means that the addressing
modes need to be switched and that the noexec protection only works
for user space.
After switching the addressing modes, we cannot use the mvcp/mvcs
instructions anymore to copy between kernel and user space. A new
mvcos instruction has been added to the z9 EC/BC hardware which allows
to copy between arbitrary address spaces, but on older hardware the
page tables need to be walked manually.Signed-off-by: Gerald Schaefer
Signed-off-by: Martin Schwidefsky -
Signed-off-by: Heiko Carstens
Signed-off-by: Martin Schwidefsky
09 Jan, 2007
1 commit
-
There are several places in the futex code where a spin_lock is held
and still uaccesses happen. Deadlocks are avoided by increasing the
preempt count. The pagefault handler will then not take any locks
but will immediately search the fixup tables.Signed-off-by: Heiko Carstens
Signed-off-by: Martin Schwidefsky
08 Dec, 2006
2 commits
-
Doesn't seem to be a good idea to duplicate code :)
Signed-off-by: Heiko Carstens
Signed-off-by: Martin Schwidefsky -
Introduce pagefault_{disable,enable}() and use these where previously we did
manual preempt increments/decrements to make the pagefault handler do the
atomic thing.Currently they still rely on the increased preempt count, but do not rely on
the disabled preemption, this might go away in the future.(NOTE: the extra barrier() in pagefault_disable might fix some holes on
machines which have too many registers for their own good)[heiko.carstens@de.ibm.com: s390 fix]
Signed-off-by: Peter Zijlstra
Acked-by: Nick Piggin
Cc: Martin Schwidefsky
Signed-off-by: Heiko Carstens
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
04 Dec, 2006
1 commit
-
Use a wrapper for copy_to/from_user to chose the best usercopy method.
The mvcos instruction is better for sizes greater than 256 bytes, if
mvcos is not available a page table walk is better for sizes greater
than 1024 bytes. Also removed the redundant copy_to/from_user_std_small
functions.Signed-off-by: Gerald Schaefer
Signed-off-by: Martin Schwidefsky
01 Oct, 2006
1 commit
-
Use the new diagnose 0x9c in the spinlock implementation for s390. It
yields the remaining timeslice of the virtual cpu that tries to acquire a
lock to the virtual cpu that is the current holder of the lock.Signed-off-by: Martin Schwidefsky
Cc: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
28 Sep, 2006
3 commits
-
Major cleanup of all s390 inline assemblies. They now have a common
coding style. Quite a few have been shortened, mainly by using register
asm variables. Use of the EX_TABLE macro helps as well. The atomic ops,
bit ops and locking inlines new use the Q-constraint if a newer gcc
is used. That results in slightly better code.Thanks to Christian Borntraeger for proof reading the changes.
Signed-off-by: Martin Schwidefsky
-
A user space program can read uninitialised kernel memory
by appending to a file from a bad address and then reading
the result back. The cause is the copy_from_user function
that does not clear the remaining bytes of the kernel
buffer after it got a fault on the user space address.Signed-off-by: Martin Schwidefsky
-
The clocksource infrastructure introduced with commit
ad596171ed635c51a9eef829187af100cbf8dcf7 broke 31 bit s390.
The reason is that the do_div() primitive for 31 bit always
had a restriction: it could only divide an unsigned 64 bit
integer by an unsigned 31 bit integer. The clocksource code
now uses do_div() with a base value that has the most
significant bit set. The result is that clock->cycle_interval
has a funny value which causes the linux time to jump around
like mad.
The solution is "obvious": implement a proper __div64_32
function for 31 bit s390.Signed-off-by: Martin Schwidefsky
20 Sep, 2006
2 commits
-
This introduces new user-copy operations which are optimized for
copying more than 256 Bytes on new hardware.Signed-off-by: Gerald Schaefer
Signed-off-by: Martin Schwidefsky -
Introduces a struct uaccess_ops which allows setting user-copy
operations at run-time.Signed-off-by: Gerald Schaefer
Signed-off-by: Martin Schwidefsky
30 Aug, 2006
1 commit
-
The copy_in_user primitive does not work as advertised. If the source
and target area are available copy_in_user copies one byte too much.
If one of the memory areas is not available it does not copy as much
data as it can, but up to 257 bytes less.Signed-off-by: Martin Schwidefsky
12 Jul, 2006
1 commit
-
Signed-off-by: Heiko Carstens
Signed-off-by: Martin Schwidefsky
01 Jul, 2006
1 commit
-
Signed-off-by: Jörn Engel
Signed-off-by: Adrian Bunk
10 Mar, 2006
1 commit
-
Currently the code tries up to spin_retry times to grab a lock using the cs
instruction. The cs instruction has exclusive access to a memory region
and therefore invalidates the appropiate cache line of all other cpus. If
there is contention on a lock this leads to cache line trashing. This can
be avoided if we first check wether a cs instruction is likely to succeed
before the instruction gets actually executed.Signed-off-by: Christian Ehrhardt
Signed-off-by: Martin Schwidefsky
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
09 Mar, 2006
1 commit
-
strnlen_user is supposed to return then length count + 1 if no terminating \0
is found, and it should return 0 on exception. Found by David Howells
.Signed-off-by: Gerald Schaefer
Signed-off-by: Heiko Carstens
Acked-By: David Howells
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
15 Feb, 2006
1 commit
-
Fix __delay implementation. Called with an argument "1" or "0" it would
loop nearly forever (since (1/2)-1 = 0xffffffff).Signed-off-by: Heiko Carstens
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
02 Feb, 2006
1 commit
-
- Remove all CVS generated information like e.g. revision IDs from
drivers/s390 and include/asm-s390 (none present in arch/s390).- Add newline at end of arch/s390/lib/Makefile to avoid diff message.
Acked-by: Andreas Herrmann
Acked-by: Frank Pavlic
Signed-off-by: Heiko Carstens
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds