06 Dec, 2011
1 commit
-
do_notify_resume() gets called with interrupts disabled on x86_32. This
is different from the x86_64 behavior, where interrupts are enabled at
the time.Queries on lkml on this issue hasn't yielded any clear answer. Lets make
x86_32 behave the same as x86_64, unless there is a real reason to
maintain status quo.Please refer https://lkml.org/lkml/2011/9/27/130 for more
details.A similar change was suggested in ARM:
https://lkml.org/lkml/2011/8/25/231
My 32-bit machine works fine (tm) with this patch.
Signed-off-by: Srikar Dronamraju
Acked-by: Masami Hiramatsu
Signed-off-by: Peter Zijlstra
Cc: Linus Torvalds
Link: http://lkml.kernel.org/r/20111025141812.GA21225@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar
26 Aug, 2011
1 commit
-
entry_32.S contained a hardcoded alternative instruction entry, and the
format changed in commit 59e97e4d6fbc ("x86: Make alternative
instruction pointers relative").Replace the hardcoded entry with the altinstruction_entry macro. This
fixes the 32-bit boot with CONFIG_X86_INVD_BUG=y.Reported-and-tested-by: Arnaud Lacombe
Signed-off-by: Andy Lutomirski
Cc: Peter Anvin
Cc: Ingo Molnar
Signed-off-by: Linus Torvalds
17 Mar, 2011
1 commit
-
…rnel/git/tip/linux-2.6-tip
* 'x86-trampoline-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: Fix binutils-2.21 symbol related build failures
x86-64, trampoline: Remove unused variable
x86, reboot: Fix the use of passed arguments in 32-bit BIOS reboot
x86, reboot: Move the real-mode reboot code to an assembly file
x86: Make the GDT_ENTRY() macro in <asm/segment.h> safe for assembly
x86, trampoline: Use the unified trampoline setup for ACPI wakeup
x86, trampoline: Common infrastructure for low memory trampolinesFix up trivial conflicts in arch/x86/kernel/Makefile
16 Mar, 2011
1 commit
-
* 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, binutils, xen: Fix another wrong size directive
x86: Remove dead config option X86_CPU
x86: Really print supported CPUs if PROCESSOR_SELECT=y
x86: Fix a bogus unwind annotation in lib/semaphore_32.S
um, x86-64: Fix UML build after adding CFI annotations to lib/rwsem_64.S
x86: Remove unused bits from lib/thunk_*.S
x86: Use {push,pop}_cfi in more places
x86-64: Add CFI annotations to lib/rwsem_64.S
x86, asm: Cleanup unnecssary macros in asm-offsets.c
x86, system.h: Drop unused __SAVE/__RESTORE macros
x86: Use bitmap library functions
x86: Partly unify asm-offsets_{32,64}.c
x86: Reduce back the alignment of the per-CPU data section
09 Mar, 2011
2 commits
-
New binutils version 2.21.0.20110302-1 started checking that the symbol
parameter to the .size directive matches the entry name's
symbol parameter, unearthing two mismatches:AS arch/x86/kernel/acpi/wakeup_rm.o
arch/x86/kernel/acpi/wakeup_rm.S: Assembler messages:
arch/x86/kernel/acpi/wakeup_rm.S:12: Error: .size expression with symbol `wakeup_code_start' does not evaluate to a constantarch/x86/kernel/entry_32.S: Assembler messages:
arch/x86/kernel/entry_32.S:1421: Error: .size expression with
symbol `apf_page_fault' does not evaluate to a constantThe problem was discovered while using Debian's binutils
(2.21.0.20110302-1) and experimenting with binutils from
upstream.Thanks Alexander and H.J. for the vital help.
Signed-off-by: Sedat Dilek
Cc: Alexander van Heukelum
Cc: H.J. Lu
Cc: Len Brown
Cc: Pavel Machek
Cc: Rafael J. Wysocki
LKML-Reference:
Signed-off-by: Ingo Molnar -
Put x86 entry code into a separate link section: .entry.text.
Separating the entry text section seems to have performance
benefits - caused by more efficient instruction cache usage.Running hackbench with perf stat --repeat showed that the change
compresses the icache footprint. The icache load miss rate went
down by about 15%:before patch:
19417627 L1-icache-load-misses ( +- 0.147% )after patch:
16490788 L1-icache-load-misses ( +- 0.180% )The motivation of the patch was to fix a particular kprobes
bug that relates to the entry text section, the performance
advantage was discovered accidentally.Whole perf output follows:
- results for current tip tree:
Performance counter stats for './hackbench/hackbench 10' (500 runs):
19417627 L1-icache-load-misses ( +- 0.147% )
2676914223 instructions # 0.497 IPC ( +- 0.079% )
5389516026 cycles ( +- 0.144% )0.206267711 seconds time elapsed ( +- 0.138% )
- results for current tip tree with the patch applied:
Performance counter stats for './hackbench/hackbench 10' (500 runs):
16490788 L1-icache-load-misses ( +- 0.180% )
2717734941 instructions # 0.502 IPC ( +- 0.079% )
5414756975 cycles ( +- 0.148% )0.206747566 seconds time elapsed ( +- 0.137% )
Signed-off-by: Jiri Olsa
Cc: Arnaldo Carvalho de Melo
Cc: Frederic Weisbecker
Cc: Peter Zijlstra
Cc: Linus Torvalds
Cc: Andrew Morton
Cc: Nick Piggin
Cc: Eric Dumazet
Cc: masami.hiramatsu.pt@hitachi.com
Cc: ananth@in.ibm.com
Cc: davem@davemloft.net
Cc: 2nddept-manager@sdl.hitachi.co.jp
LKML-Reference:
Signed-off-by: Ingo Molnar
01 Mar, 2011
1 commit
-
Cleaning up and shortening code...
Signed-off-by: Jan Beulich
Cc: Alexander van Heukelum
LKML-Reference:
Signed-off-by: Ingo Molnar
26 Feb, 2011
1 commit
-
PAGE_SIZE_asm, PAGE_SHIFT_asm, THREAD_SIZE_asm can be safely removed from
asm-offsets.c, and be replaced by their non-'_asm' counterparts in the code
that uses them, since the _AC macro defined in include/linux/const.h makes
PAGE_SIZE/PAGE_SHIFT/THREAD_SIZE work with as.Signed-off-by: Stratos Psomadakis
LKML-Reference:
Signed-off-by: H. Peter Anvin
12 Jan, 2011
1 commit
-
When async PF capability is detected hook up special page fault handler
that will handle async page fault events and bypass other page faults to
regular page fault handler. Also add async PF handling to nested SVM
emulation. Async PF always generates exit to L1 where vcpu thread will
be scheduled out until page is available.Acked-by: Rik van Riel
Signed-off-by: Gleb Natapov
Signed-off-by: Marcelo Tosatti
18 Nov, 2010
1 commit
-
Add parentheses around one pushl_cfi argument.
Commit df5d1874 "x86: Use {push,pop}{l,q}_cfi in more places"
caused GNU assembler 2.15 (Debian Sarge) to fail. It is still
failing as of commit 07bd8516 "x86, asm: Restore parentheses
around one pushl_cfi argument". This patch solves build failure
with GNU assembler 2.15.Signed-off-by: Tetsuo Handa
Acked-by: Jan Beulich
Cc: heukelum@fastmail.fm
Cc: hpa@linux.intel.com
LKML-Reference:
Signed-off-by: Ingo Molnar
22 Oct, 2010
1 commit
-
These were (intentionally) stripped by "fix CFI macro
invocations to deal with shortcomings in gas" to expose problems
with unexpected splitting of arguments by older gas also on
newer versions, but as it turns out there is at least one distro
(Ubuntu 6.06) where even not having *any* spaces in a macro
argument doesn't reliably prevent splitting into multiple
arguments.Signed-off-by: Jan Beulich
Acked-by: Alexander van Heukelum
LKML-Reference:
Signed-off-by: Ingo Molnar
20 Oct, 2010
1 commit
-
gas prior to (perhaps) 2.16.90 has problems with passing non-
parenthesized expressions containing spaces to macros. Spaces, however,
get inserted by cpp between any macro expanding to a number and a
subsequent + or -. For the +, current x86 gas then removes the space
again (future gas may not do so), but for the - the space gets retained
and is then considered a separator between macro arguments.Fix the respective definitions for both the - and + cases, so that they
neither contain spaces nor make cpp insert any (the latter by adding
seemingly redundant parentheses).Signed-off-by: Jan Beulich
LKML-Reference:
Cc: Alexander van Heukelum
Signed-off-by: H. Peter Anvin
03 Sep, 2010
2 commits
-
... plus additionally introduce {push,pop}f{l,q}_cfi. All in the
hope that the code becomes better readable this way (it gets
quite a bit smaller in any case).Signed-off-by: Jan Beulich
Acked-by: Alexander van Heukelum
LKML-Reference:
Signed-off-by: Ingo Molnar -
When these stubs are actual functions (i.e. having a return
instruction) and have stack manipulation instructions in them,
they should also be annotated to allow unwinding through them.Signed-off-by: Jan Beulich
Acked-by: Alexander van Heukelum
LKML-Reference:
Signed-off-by: Ingo Molnar
07 Aug, 2010
2 commits
-
…kernel/git/tip/linux-2.6-tip
* 'x86-alternatives-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, alternatives: BUG on encountering an invalid CPU feature number
x86, alternatives: Fix one more open-coded 8-bit alternative number
x86, alternatives: Use 16-bit numbers for cpufeature index -
* 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
um, x86: Cast to (u64 *) inside set_64bit()
x86-32, asm: Directly access per-cpu GDT
x86-64, asm: Directly access per-cpu IST
x86, asm: Merge cmpxchg_486_u64() and cmpxchg8b_emu()
x86, asm: Move cmpxchg emulation code to arch/x86/lib
x86, asm: Clean up and simplify
x86, asm: Clean up and simplify set_64bit()
x86: Add memory modify constraints to xchg() and cmpxchg()
x86-64: Simplify loading initial_gs
x86: Use symbolic MSR names
x86: Remove redundant K6 MSRs
02 Aug, 2010
1 commit
-
Use a direct per-cpu reference for the GDT instead of using a scratch
register.Signed-off-by: Brian Gerst
LKML-Reference:
Signed-off-by: H. Peter Anvin
23 Jul, 2010
1 commit
-
Set the callback to receive evtchns from Xen, using the
callback vector delivery mechanism.The traditional way for receiving event channel notifications from Xen
is via the interrupts from the platform PCI device.
The callback vector is a newer alternative that allow us to receive
notifications on any vcpu and doesn't need any PCI support: we allocate
a vector exclusively to receive events, in the vector handler we don't
need to interact with the vlapic, therefore we avoid a VMEXIT.Signed-off-by: Stefano Stabellini
Signed-off-by: Sheng Yang
Signed-off-by: Jeremy Fitzhardinge
08 Jul, 2010
1 commit
-
We already have cpufeature indicies above 255, so use a 16-bit number
for the alternatives index. This consumes a padding field and so
doesn't add any size, but it means that abusing the padding field to
create assembly errors on overflow no longer works. We can retain the
test simply by redirecting it to the .discard section, however.[ v3: updated to include open-coded locations ]
Signed-off-by: H. Peter Anvin
LKML-Reference:
Signed-off-by: H. Peter Anvin
04 May, 2010
1 commit
-
The cache flush denied error is an erratum on some AMD 486 clones. If an invd
instruction is executed in userspace, the processor calls exception 19 (13 hex)
instead of #GP (13 decimal). On cpus where XMM is not supported, redirect
exception 19 to do_general_protection(). Also, remove die_if_kernel(), since
this was the last user.Signed-off-by: Brian Gerst
LKML-Reference:
Signed-off-by: H. Peter Anvin
11 Dec, 2009
1 commit
-
The arg should be in %eax, but that is clobbered by the return value
of clone. The function pointer can be in any register. Also, don't
push args onto the stack, since regparm(3) is the normal calling
convention now.Signed-off-by: Brian Gerst
LKML-Reference:
Signed-off-by: H. Peter Anvin
10 Dec, 2009
7 commits
-
In the PTREGSCALL1 and 2 macros, we can trivially avoid an unnecessary
pipeline serialization, so do so.In PTREGSCALLS3 this is much less clear-cut since we have to push a
new value to the stack. Leave it alone for now assuming it is as good
as it is going to be; may want to check on Atom or another in-order
x86 to see if we can do better.Signed-off-by: H. Peter Anvin
Cc: Brian Gerst
LKML-Reference: -
Change 32-bit sys_clone to new PTREGSCALL stub, and merge with 64-bit.
Signed-off-by: Brian Gerst
LKML-Reference:
Signed-off-by: H. Peter Anvin -
Convert these to new PTREGSCALL stubs.
Signed-off-by: Brian Gerst
LKML-Reference:
Signed-off-by: H. Peter Anvin -
Change 32-bit sys_sigaltstack to PTREGSCALL2, and merge with 64-bit.
Signed-off-by: Brian Gerst
LKML-Reference:
Signed-off-by: H. Peter Anvin -
Change 32-bit sys_execve to PTREGSCALL3, and merge with 64-bit.
Signed-off-by: Brian Gerst
LKML-Reference:
Signed-off-by: H. Peter Anvin -
Change 32-bit sys_iopl to PTREGSCALL1, and merge with 64-bit.
Signed-off-by: Brian Gerst
LKML-Reference:
Signed-off-by: H. Peter Anvin -
Add new stubs which add the pt_regs pointer as the last arg, matching
64-bit. This will allow these syscalls to be easily merged.Signed-off-by: Brian Gerst
LKML-Reference:
Signed-off-by: H. Peter Anvin
23 Oct, 2009
1 commit
-
Conflicts:
tools/perf/MakefileMerge reason:
- fix the conflict
- pick up the pr_*() infrastructure to queue up dependent patchSigned-off-by: Ingo Molnar
14 Oct, 2009
1 commit
-
The function graph tracer replaces the return address with a hook
to trace the exit of the function call. This hook will finish by
returning to the real location the function should return to.But the current implementation uses a ret to jump to the real
return location. This causes a imbalance between calls and ret.
That is the original function does a call, the ret goes to the
handler and then the handler does a ret without a matching call.Although the function graph tracer itself still breaks the branch
predictor by replacing the original ret, by using a second ret and
causing an imbalance, it breaks the predictor even more.This patch replaces the ret with a jmp to keep the calls and ret
balanced. I tested this on one box and it showed a 1.7% increase in
performance. Another box only showed a small 0.3% increase. But no
box that I tested this on showed a decrease in performance by
making this change.Signed-off-by: Steven Rostedt
Acked-by: Mathieu Desnoyers
Cc: Frederic Weisbecker
LKML-Reference:
Signed-off-by: Ingo Molnar
11 Sep, 2009
1 commit
-
Move irq-exit functions to .kprobes.text section to protect against
kprobes recursion.When I ran kprobe stress test on x86-32, I found below symbols
cause unrecoverable recursive probing:ret_from_exception
ret_from_intr
check_userspace
restore_all
restore_all_notrace
restore_nocheck
irq_returnAnd also, I found some interrupt/exception entry points that
cause similar problems.This patch moves those symbols (including their container functions)
to .kprobes.text section to prevent any kprobes probing.Signed-off-by: Masami Hiramatsu
Cc: Frederic Weisbecker
Cc: Ananth N Mavinakayanahalli
Cc: Jim Keniston
Cc: Ingo Molnar
LKML-Reference:
Signed-off-by: Frederic Weisbecker
21 Jun, 2009
1 commit
-
…nel/git/tip/linux-2.6-tip
* 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (24 commits)
tracing/urgent: warn in case of ftrace_start_up inbalance
tracing/urgent: fix unbalanced ftrace_start_up
function-graph: add stack frame test
function-graph: disable when both x86_32 and optimize for size are configured
ring-buffer: have benchmark test print to trace buffer
ring-buffer: do not grab locks in nmi
ring-buffer: add locks around rb_per_cpu_empty
ring-buffer: check for less than two in size allocation
ring-buffer: remove useless compile check for buffer_page size
ring-buffer: remove useless warn on check
ring-buffer: use BUF_PAGE_HDR_SIZE in calculating index
tracing: update sample event documentation
tracing/filters: fix race between filter setting and module unload
tracing/filters: free filter_string in destroy_preds()
ring-buffer: use commit counters for commit pointer accounting
ring-buffer: remove unused variable
ring-buffer: have benchmark test handle discarded events
ring-buffer: prevent adding write in discarded area
tracing/filters: strloc should be unsigned short
tracing/filters: operand can be negative
...Fix up kmemcheck-induced conflict in kernel/trace/ring_buffer.c manually
19 Jun, 2009
1 commit
-
In case gcc does something funny with the stack frames, or the return
from function code, we would like to detect that.An arch may implement passing of a variable that is unique to the
function and can be saved on entering a function and can be tested
when exiting the function. Usually the frame pointer can be used for
this purpose.This patch also implements this for x86. Where it passes in the stack
frame of the parent function, and will test that frame on exit.There was a case in x86_32 with optimize for size (-Os) where, for a
few functions, gcc would align the stack frame and place a copy of the
return address into it. The function graph tracer modified the copy and
not the actual return address. On return from the funtion, it did not go
to the tracer hook, but returned to the parent. This broke the function
graph tracer, because the return of the parent (where gcc did not do
this funky manipulation) returned to the location that the child function
was suppose to. This caused strange kernel crashes.This test detected the problem and pointed out where the issue was.
This modifies the parameters of one of the functions that the arch
specific code calls, so it includes changes to arch code to accommodate
the new prototype.Note, I notice that the parsic arch implements its own push_return_trace.
This is now a generic function and the ftrace_push_return_trace should be
used instead. This patch does not touch that code.Cc: Benjamin Herrenschmidt
Cc: Paul Mackerras
Cc: Heiko Carstens
Cc: Martin Schwidefsky
Cc: Frederic Weisbecker
Cc: Helge Deller
Cc: Kyle McMartin
Signed-off-by: Steven Rostedt
18 Jun, 2009
3 commits
-
asm/desc.h is included in three assembly files, but the only macro
it defines, GET_DESC_BASE, is never used. This patch removes the
includes, removes the macro GET_DESC_BASE and the ASSEMBLY guard
from asm/desc.h.Signed-off-by: Alexander van Heukelum
Signed-off-by: H. Peter Anvin -
The espfix code triggers if we have a protected mode userspace
application with a 16-bit stack. On returning to userspace, with iret,
the CPU doesn't restore the high word of the stack pointer. This is an
"official" bug, and the work-around used in the kernel is to temporarily
switch to a 32-bit stack segment/pointer pair where the high word of the
pointer is equal to the high word of the userspace stackpointer.The current implementation uses THREAD_SIZE to determine the cut-off,
but there is no good reason not to use the more natural 64kb... However,
implementing this by simply substituting THREAD_SIZE with 65536 in
patch_espfix_desc crashed the test application. patch_espfix_desc tries
to do what is described above, but gets it subtly wrong if the userspace
stack pointer is just below a multiple of THREAD_SIZE: an overflow
occurs to bit 13... With a bit of luck, when the kernelspace
stackpointer is just below a 64kb-boundary, the overflow then ripples
trough to bit 16 and userspace will see its stack pointer changed by
65536.This patch moves all espfix code into entry_32.S. Selecting a 16-bit
cut-off simplifies the code. The game with changing the limit dynamically
is removed too. It complicates matters and I see no value in it. Changing
only the top 16-bit word of ESP is one instruction and it also implies
that only two bytes of the ESPFIX GDT entry need to be changed and this
can be implemented in just a handful simple to understand instructions.
As a side effect, the operation to compute the original ESP from the
ESPFIX ESP and the GDT entry simplifies a bit too, and the remaining
three instructions have been expanded inline in entry_32.S.impact: can now reliably run userspace with ESP=xxxxfffc on 16-bit
stack segmentSigned-off-by: Alexander van Heukelum
Acked-by: Stas Sergeev
Signed-off-by: H. Peter Anvin -
Returning to a task with a 16-bit stack requires special care: the iret
instruction does not restore the high word of esp in that case. The
espfix code fixes this, but currently is not invoked on NMIs. This means
that a running task gets the upper word of esp clobbered due intervening
NMIs. To reproduce, compile and run the following program with the nmi
watchdog enabled (nmi_watchdog=2 on the command line). Using gdb you can
see that the high bits of esp contain garbage, while the low bits are
still correct.This patch puts the espfix code back into the NMI code path.
The patch is slightly complicated due to the irqtrace infrastructure not
being NMI-safe. The NMI return path cannot call TRACE_IRQS_IRET.
Otherwise, the tail of the normal iret-code is correct for the nmi code
path too. To be able to share this code-path, the TRACE_IRQS_IRET was
move up a bit. The espfix code exists after the TRACE_IRQS_IRET, but
this code explicitly disables interrupts. This short interrupts-off
section is now not traced anymore. The return-to-kernel path now always
includes the preliminary test to decide if the espfix code should be
called. This is never the case, but doing it this way keeps the patch as
simple as possible and the few extra instructions should not affect
timing in any significant way.#define _GNU_SOURCE
#include
#include
#include
#include
#include
#includeint modify_ldt(int func, void *ptr, unsigned long bytecount)
{
return syscall(SYS_modify_ldt, func, ptr, bytecount);
}/* this is assumed to be usable */
#define SEGBASEADDR 0x10000
#define SEGLIMIT 0x20000/* 16-bit segment */
struct user_desc desc = {
.entry_number = 0,
.base_addr = SEGBASEADDR,
.limit = SEGLIMIT,
.seg_32bit = 0,
.contents = 0, /* ??? */
.read_exec_only = 0,
.limit_in_pages = 0,
.seg_not_present = 0,
.useable = 1
};int main(void)
{
setvbuf(stdout, NULL, _IONBF, 0);/* map a 64 kb segment */
char *pointer = mmap((void *)SEGBASEADDR, SEGLIMIT+1,
PROT_EXEC|PROT_READ|PROT_WRITE,
MAP_SHARED|MAP_ANONYMOUS, -1, 0);
if (pointer == NULL) {
printf("could not map space\n");
return 0;
}/* write ldt, new mode */
int err = modify_ldt(0x11, &desc, sizeof(desc));
if (err) {
printf("error modifying ldt: %i\n", err);
return 0;
}for (int i=0; i
Acked-by: Stas Sergeev
Signed-off-by: H. Peter Anvin
14 Mar, 2009
1 commit
-
Fix:
arch/x86/kernel/entry_32.S:446: Warning: 00000000080001d1 shortened to 00000000000001d1
arch/x86/kernel/entry_32.S:457: Warning: 000000000800feff shortened to 000000000000feff
arch/x86/kernel/entry_32.S:527: Warning: 00000000080001d1 shortened to 00000000000001d1
arch/x86/kernel/entry_32.S:541: Warning: 000000000800feff shortened to 000000000000feff
arch/x86/kernel/entry_32.S:676: Warning: 0000000008000091 shortened to 0000000000000091TIF_SYSCALL_FTRACE is 0x08000000 and until now we checked the
first 16 bits of the work mask - bit 27 falls outside of that.Update the entry_32.S code to check the full 32-bit mask.
[ %cx => %ecx fix from Cyrill Gorcunov ]
Signed-off-by: Jaswinder Singh Rajput
Cc: Frederic Weisbecker
Cc: Steven Rostedt
Cc: "H. Peter Anvin"
LKML-Reference:
Signed-off-by: Ingo Molnar
24 Feb, 2009
1 commit
-
Impact: Cleanup
Checkin be44d2aabce2d62f72d5751d1871b6212bf7a1c7 eliminates the use of
a 16-bit stack for espfix. However, at least one instruction remained
that only operated on the low 16 bits of %esp.This is not a bug per se because the kernel stack is always an aligned
4K or 8K block. Therefore it cannot cross 64K boundaries; this code,
in fact, relies strictly on that fact.However, it's a lot cleaner (and, for that matter, smaller) to operate
on the entire 32-bit register.Signed-off-by: Stas Sergeev
CC: Zachary Amsden
CC: Chuck Ebbert
Signed-off-by: H. Peter Anvin
14 Feb, 2009
1 commit
-
In general, the only definitions that assembly files can use
are in _types.S headers (where available), so convert them.Signed-off-by: Jeremy Fitzhardinge
13 Feb, 2009
1 commit