03 Apr, 2010
3 commits
-
This patch just states the fact the cpusets/cpuhotplug interaction is
broken and removes the deadlockable code which only pretends to work.- cpuset_lock() doesn't really work. It is needed for
cpuset_cpus_allowed_locked() but we can't take this lock in
try_to_wake_up()->select_fallback_rq() path.- cpuset_lock() is deadlockable. Suppose that a task T bound to CPU takes
callback_mutex. If cpu_down(CPU) happens before T drops callback_mutex
stop_machine() preempts T, then migration_call(CPU_DEAD) tries to take
cpuset_lock() and hangs forever because CPU is already dead and thus
T can't be scheduled.- cpuset_cpus_allowed_locked() is deadlockable too. It takes task_lock()
which is not irq-safe, but try_to_wake_up() can be called from irq.Kill them, and change select_fallback_rq() to use cpu_possible_mask, like
we currently do without CONFIG_CPUSETS.Also, with or without this patch, with or without CONFIG_CPUSETS, the
callers of select_fallback_rq() can race with each other or with
set_cpus_allowed() pathes.The subsequent patches try to to fix these problems.
Signed-off-by: Oleg Nesterov
Signed-off-by: Peter Zijlstra
LKML-Reference:
Signed-off-by: Ingo Molnar -
This is left over from commit 7c9414385e ("sched: Remove USER_SCHED"")
Signed-off-by: Li Zefan
Acked-by: Dhaval Giani
Signed-off-by: Peter Zijlstra
Cc: David Howells
LKML-Reference:
Signed-off-by: Ingo Molnar -
Merge reason: update to latest upstream
Signed-off-by: Ingo Molnar
30 Mar, 2010
6 commits
-
…s/security-testing-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6:
CRED: Fix memory leak in error handling -
…git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: Do not free zero sized per cpu areas
x86: Make sure free_init_pages() frees pages on page boundary
x86: Make smp_locks end with page alignment -
Fix a memory leak on an OOM condition in prepare_usermodehelper_creds().
Signed-off-by: Mathieu Desnoyers
Signed-off-by: David Howells
Signed-off-by: James Morris -
This avoids an infinite loop in free_early_partial().
Add a warning to free_early_partial() to catch future problems.
-v5: put back start > end back into WARN_ONCE()
-v6: use one line for warning, suggested by Linus
-v7: more tests
-v8: remove the function name as suggested by Johannes
WARN_ONCE() will print out that function name.Signed-off-by: Ian Campbell
Signed-off-by: Yinghai Lu
Tested-by: Konrad Rzeszutek Wilk
Tested-by: Joel Becker
Tested-by: Stanislaw Gruszka
Acked-by: Johannes Weiner
Cc: Peter Zijlstra
Cc: David Miller
Cc: Benjamin Herrenschmidt
Cc: Linus Torvalds
LKML-Reference:
Signed-off-by: Ingo Molnar -
CONFIG_SLOW_WORK_PROC was changed to CONFIG_SLOW_WORK_DEBUG, but not in all
instances. Change the remaining instances. This makes the debugfs file
display the time mark and the owner's description again.Signed-off-by: David Howells
Signed-off-by: Linus Torvalds -
Otherwise we can get an oops if the user has no get_ref/put_ref
requirement.Signed-off-by: Dave Airlie
Signed-off-by: David Howells
Signed-off-by: Linus Torvalds
27 Mar, 2010
6 commits
-
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
x86/PCI: truncate _CRS windows with _LEN > _MAX - _MIN + 1
x86/PCI: for host bridge address space collisions, show conflicting resource
frv/PCI: remove redundant warnings
x86/PCI: remove redundant warnings
PCI: don't say we claimed a resource if we failed
PCI quirk: Disable MSI on VIA K8T890 systems
PCI quirk: RS780/RS880: work around missing MSI initialization
PCI quirk: only apply CX700 PCI bus parking quirk if external VT6212L is present
PCI: complain about devices that seem to be broken
PCI: print resources consistently with %pR
PCI: make disabled window printk style match the enabled ones
PCI: break out primary/secondary/subordinate for readability
PCI: for address space collisions, show conflicting resource
resources: add interfaces that return conflict information
PCI: cleanup error return for pcix get and set mmrbc functions
PCI: fix access of PCI_X_CMD by pcix get and set mmrbc functions
PCI: kill off pci_register_set_vga_state() symbol export.
PCI: fix return value from pcix_get_max_mmrbc() -
…el/git/tip/linux-2.6-tip
* 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
time: Fix accumulation bug triggered by long delay.
posix-cpu-timers: Reset expire cache when no timer is running
timer stats: Fix del_timer_sync() and try_to_del_timer_sync()
clockevents: Sanitize min_delta_ns adjustment and prevent overflows -
…nel/git/tip/linux-2.6-tip
* 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
ring-buffer: Do 8 byte alignment for 64 bit that can not handle 4 byte align -
…l/git/tip/linux-2.6-tip
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sched: Use proper type in sched_getaffinity()
kernel/sched.c: Suppress unused var warning
sched: sched_getaffinity(): Allow less than NR_CPUS length -
…git/tip/linux-2.6-tip
* 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
genirq: Move two IRQ functions from .init.text to .text
genirq: Protect access to irq_desc->action in can_request_irq()
genirq: Prevent oneshot irq thread race -
…/git/tip/linux-2.6-tip
* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: Remove excessive early_res debug output
softlockup: Stop spurious softlockup messages due to overflow
rcu: Fix local_irq_disable() CONFIG_PROVE_RCU=y false positives
rcu: Fix tracepoints & lockdep false positive
rcu: Make rcu_read_lock_bh_held() allow for disabled BH
25 Mar, 2010
3 commits
-
Signed-off-by: Miao Xie
Acked-by: David Rientjes
Cc: Nick Piggin
Cc: Paul Menage
Cc: Li Zefan
Cc: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
cpuset_mem_spread_node() returns an offline node, and causes an oops.
This patch fixes it by initializing task->mems_allowed to
node_states[N_HIGH_MEMORY], and updating task->mems_allowed when doing
memory hotplug.Signed-off-by: Miao Xie
Acked-by: David Rientjes
Reported-by: Nick Piggin
Tested-by: Nick Piggin
Cc: Paul Menage
Cc: Li Zefan
Cc: Ingo Molnar
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
commit e6a1105b ("cgroups: subsystem module loading interface") and commit
c50cc752 ("sched, cgroups: Fix module export") result in duplicate
including of module.hSigned-off-by: Li Zefan
Acked-by: Paul Menage
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
24 Mar, 2010
3 commits
-
Both functions should not be marked as __init, since they be called
from modules after the init section is freed.Signed-off-by: Henrik Kretzschmar
Cc: Yinghai Lu
Cc: Peter Zijlstra
Cc: Jiri Kosina
LKML-Reference:
Signed-off-by: Thomas Gleixner -
can_request_irq() accesses and dereferences irq_desc->action w/o
holding irq_desc->lock. So action can be freed on another CPU before
it's dereferenced. Unlikely, but ...Protect it with desc->lock.
Signed-off-by: Thomas Gleixner
-
request_resource() and insert_resource() only return success or failure,
which no information about what existing resource conflicted with the
proposed new reservation. This patch adds request_resource_conflict()
and insert_resource_conflict(), which return the conflicting resource.Callers may use this for better error messages or to adjust the new
resource and retry the request.Signed-off-by: Bjorn Helgaas
Signed-off-by: Jesse Barnes
23 Mar, 2010
1 commit
-
The logarithmic accumulation done in the timekeeping has some overflow
protection that limits the max shift value. That means it will take
more then shift loops to accumulate all of the cycles. This causes
the shift decrement to underflow, which causes the loop to never exit.The simplest fix would be simply to do a:
if (shift)
shift--;However that is not optimal, as we know the cycle offset is larger
then the interval << shift, the above would make shift drop to zero,
then we would be spinning for quite awhile accumulating at interval
chunks at a time.Instead, this patch only decreases shift if the offset is smaller
then cycle_interval << shift. This makes sure we accumulate using
the largest chunks possible without overflowing tick_length, and limits
the number of iterations through the loop.This issue was found and reported by Sonic Zhang, who also tested the fix.
Many thanks your explanation and testing!Reported-by: Sonic Zhang
Signed-off-by: John Stultz
Tested-by: Sonic Zhang
LKML-Reference:
Signed-off-by: Thomas Gleixner
22 Mar, 2010
1 commit
-
Ensure additions on touch_ts do not overflow. This can occur
when the top 32 bits of the TSC reach 0xffffffff causing
additions to touch_ts to overflow and this in turn generates
spurious softlockup warnings.Signed-off-by: Colin Ian King
Cc: Peter Zijlstra
Cc: Eric Dumazet
Cc:
LKML-Reference:
Signed-off-by: Ingo Molnar
19 Mar, 2010
2 commits
-
The ring buffer uses 4 byte alignment while recording events into the
buffer, even on 64bit machines. This saves space when there are lots
of events being recorded at 4 byte boundaries.The ring buffer has a zero copy method to write into the buffer, with
the reserving of space and then committing it. This may cause problems
when writing an 8 byte word into a 4 byte alignment (not 8). For x86 and
PPC this is not an issue, but on some architectures this would cause an
out-of-alignment exception.This patch uses CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS to determine
if it is OK to use 4 byte alignments on 64 bit machines. If it is not,
it forces the ring buffer event header to be 8 bytes and not 4,
and will align the length of the data to be 8 byte aligned.
This keeps the data payload at 8 byte alignments and will allow these
machines to run without issue.The trick to this is that the header can be either 4 bytes or 8 bytes
depending on the length of the data payload. The 4 byte header
has a length field that supports up to 112 bytes. If the length of
the data is more than 112, the length field is set to zero, and the actual
length is stored in the next 4 bytes after the header.When CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS is not set, the code forces
zero in the 4 byte header forcing the length to be stored in the 4 byte
array, even with a small data load. It also forces the length of the
data load to be 8 byte aligned. The combination of these two guarantee
that the data is always at 8 byte alignment.Tested-by: Frederic Weisbecker
(on sparc64)
Reported-by: Frederic Weisbecker
Acked-by: David S. Miller
Signed-off-by: Steven Rostedt -
…/git/tip/linux-2.6-tip
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (35 commits)
perf: Fix unexported generic perf_arch_fetch_caller_regs
perf record: Don't try to find buildids in a zero sized file
perf: export perf_trace_regs and perf_arch_fetch_caller_regs
perf, x86: Fix hw_perf_enable() event assignment
perf, ppc: Fix compile error due to new cpu notifiers
perf: Make the install relative to DESTDIR if specified
kprobes: Calculate the index correctly when freeing the out-of-line execution slot
perf tools: Fix sparse CPU numbering related bugs
perf_event: Fix oops triggered by cpu offline/online
perf: Drop the obsolete profile naming for trace events
perf: Take a hot regs snapshot for trace events
perf: Introduce new perf_fetch_caller_regs() for hot regs snapshot
perf/x86-64: Use frame pointer to walk on irq and process stacks
lockdep: Move lock events under lockdep recursion protection
perf report: Print the map table just after samples for which no map was found
perf report: Add multiple event support
perf session: Change perf_session post processing functions to take histogram tree
perf session: Add storage for seperating event types in report
perf session: Change add_hist_entry to take the tree root instead of session
perf record: Add ID and to recorded event data when recording multiple events
...
17 Mar, 2010
2 commits
-
perf_arch_fetch_caller_regs() is exported for the overriden x86
version, but not for the generic weak version.As a general rule, weak functions should not have their symbol
exported in the same file they are defined.So let's export it on trace_event_perf.c as it is used by trace
events only.This fixes:
ERROR: ".perf_arch_fetch_caller_regs" [fs/xfs/xfs.ko] undefined!
ERROR: ".perf_arch_fetch_caller_regs" [arch/powerpc/platforms/cell/spufs/spufs.ko] undefined!-v2: And also only build it if trace events are enabled.
-v3: Fix changelog mistakeReported-by: Stephen Rothwell
Signed-off-by: Frederic Weisbecker
Cc: Peter Zijlstra
Cc: Xiao Guangrong
Cc: Paul Mackerras
LKML-Reference:
Signed-off-by: Ingo Molnar -
Using the proper type fixes the following compiler warning:
kernel/sched.c:4850: warning: comparison of distinct pointer types lacks a cast
Signed-off-by: KOSAKI Motohiro
Cc: torvalds@linux-foundation.org
Cc: travis@sgi.com
Cc: peterz@infradead.org
Cc: drepper@redhat.com
Cc: rja@sgi.com
Cc: sharyath@in.ibm.com
Cc: steiner@sgi.com
LKML-Reference:
Signed-off-by: Ingo Molnar
16 Mar, 2010
3 commits
-
On UP:
kernel/sched.c: In function 'wake_up_new_task':
kernel/sched.c:2631: warning: unused variable 'cpu'Signed-off-by: Andrew Morton
Cc: Peter Zijlstra
Signed-off-by: Ingo Molnar -
This was left over from "7c9414385e sched: Remove USER_SCHED"
Signed-off-by: Dan Carpenter
Acked-by: Dhaval Giani
Cc: Kay Sievers
Cc: Greg Kroah-Hartman
LKML-Reference:
Signed-off-by: Ingo Molnar -
Disabling BH can stand in for rcu_read_lock_bh(), and this patch
updates rcu_read_lock_bh_held() to allow for this. In order to
avoid include-file hell, this function is moved out of line to
kernel/rcupdate.c.This fixes a false positive RCU warning.
Reported-by: Arnd Bergmann
Reported-by: Eric Dumazet
Signed-off-by: Paul E. McKenney
Acked-by: Lai Jiangshan
Cc: dipankar@in.ibm.com
Cc: mathieu.desnoyers@polymtl.ca
Cc: josh@joshtriplett.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
Cc: dhowells@redhat.com
LKML-Reference:
Signed-off-by: Ingo Molnar
15 Mar, 2010
1 commit
-
[ Note, this commit changes the syscall ABI for > 1024 CPUs systems. ]
Recently, some distro decided to use NR_CPUS=4096 for mysterious reasons.
Unfortunately, glibc sched interface has the following definition:# define __CPU_SETSIZE 1024
# define __NCPUBITS (8 * sizeof (__cpu_mask))
typedef unsigned long int __cpu_mask;
typedef struct
{
__cpu_mask __bits[__CPU_SETSIZE / __NCPUBITS];
} cpu_set_t;It mean, if NR_CPUS is bigger than 1024, cpu_set_t makes an
ABI issue ...More recently, Sharyathi Nagesh reported following test program makes
misterious syscall failure:-----------------------------------------------------------------------
#define _GNU_SOURCE
#include
#include
#includeint main()
{
cpu_set_t set;
if (sched_getaffinity(0, sizeof(cpu_set_t), &set) < 0)
printf("\n Call is failing with:%d", errno);
}
-----------------------------------------------------------------------Because the kernel assumes len argument of sched_getaffinity() is bigger
than NR_CPUS. But now it is not correct.Now we are faced with the following annoying dilemma, due to
the limitations of the glibc interface built in years ago:(1) if we change glibc's __CPU_SETSIZE definition, we lost
binary compatibility of _all_ application.(2) if we don't change it, we also lost binary compatibility of
Sharyathi's use case.Then, I would propse to change the rule of the len argument of
sched_getaffinity().Old:
len should be bigger than NR_CPUS
New:
len should be bigger than maximum possible cpu idThis creates the following behavior:
(A) In the real 4096 cpus machine, the above test program still
return -EINVAL.(B) NR_CPUS=4096 but the machine have less than 1024 cpus (almost
all machines in the world), the above can run successfully.Fortunatelly, BIG SGI machine is mainly used for HPC use case. It means
they can rebuild their programs.IOW we hope they are not annoyed by this issue ...
Reported-by: Sharyathi Nagesh
Signed-off-by: KOSAKI Motohiro
Acked-by: Ulrich Drepper
Acked-by: Peter Zijlstra
Cc: Linus Torvalds
Cc: Andrew Morton
Cc: Jack Steiner
Cc: Russ Anderson
Cc: Mike Travis
LKML-Reference:
Signed-off-by: Ingo Molnar
14 Mar, 2010
4 commits
-
…l/git/tip/linux-2.6-tip
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sched: Fix pick_next_highest_task_rt() for cgroups
sched: Cleanup: remove unused variable in try_to_wake_up()
x86: Fix sched_clock_cpu for systems with unsynchronized TSC -
…/git/tip/linux-2.6-tip
* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
locking: Make sparse work with inline spinlocks and rwlocks
x86/mce: Fix RCU lockdep splats
rcu: Increase RCU CPU stall timeouts if PROVE_RCU
ftrace: Replace read_barrier_depends() with rcu_dereference_raw()
rcu: Suppress RCU lockdep warnings during early boot
rcu, ftrace: Fix RCU lockdep splat in ftrace_perf_buf_prepare()
rcu: Suppress __mpol_dup() false positive from RCU lockdep
rcu: Make rcu_read_lock_sched_held() handle !PREEMPT
rcu: Add control variables to lockdep_rcu_dereference() diagnostics
rcu, cgroup: Relax the check in task_subsys_state() as early boot is now handled by lockdep-RCU
rcu: Use wrapper function instead of exporting tasklist_lock
sched, rcu: Fix rcu_dereference() for RCU-lockdep
rcu: Make task_subsys_state() RCU-lockdep checks handle boot-time use
rcu: Fix holdoff for accelerated GPs for last non-dynticked CPU
x86/gart: Unexport gart_iommu_apertureFix trivial conflicts in kernel/trace/ftrace.c
-
…nel/git/tip/linux-2.6-tip
* 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
tracing: Do not record user stack trace from NMI context
tracing: Disable buffer switching when starting or stopping trace
tracing: Use same local variable when resetting the ring buffer
function-graph: Init curr_ret_stack with ret_stack
ring-buffer: Move disabled check into preempt disable section
function-graph: Add tracing_thresh support to function_graph tracer
tracing: Update the comm field in the right variable in update_max_tr
function-graph: Use comment notation for func names of dangling '}'
function-graph: Fix unused reference to ftrace_set_func()
tracing: Fix warning in s_next of trace file ops
tracing: Include irqflags headers from trace clock -
…/git/tip/linux-2.6-tip
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
perf: Provide generic perf_sample_data initialization
MAINTAINERS: Add Arnaldo as tools/perf/ co-maintainer
perf trace: Don't use pager if scripting
perf trace/scripting: Remove extraneous header read
perf, ARM: Modify kuser rmb() call to compile for Thumb-2
x86/stacktrace: Don't dereference bad frame pointers
perf archive: Don't try to collect files without a build-id
perf_events, x86: Fixup fixed counter constraints
perf, x86: Restrict the ANY flag
perf, x86: rename macro in ARCH_PERFMON_EVENTSEL_ENABLE
perf, x86: add some IBS macros to perf_event.h
perf, x86: make IBS macros available in perf_event.h
hw-breakpoints: Remove stub unthrottle callback
x86/hw-breakpoints: Remove the name field
perf: Remove pointless breakpoint union
perf lock: Drop the buffers multiplexing dependency
perf lock: Fix and add misc documentally things
percpu: Add __percpu sparse annotations to hw_breakpoint
13 Mar, 2010
5 commits
-
A bug was found with Li Zefan's ftrace_stress_test that caused applications
to segfault during the test.Placing a tracing_off() in the segfault code, and examining several
traces, I found that the following was always the case. The lock tracer
was enabled (lockdep being required) and userstack was enabled. Testing
this out, I just enabled the two, but that was not good enough. I needed
to run something else that could trigger it. Running a load like hackbench
did not work, but executing a new program would. The following would
trigger the segfault within seconds:# echo 1 > /debug/tracing/options/userstacktrace
# echo 1 > /debug/tracing/events/lock/enable
# while :; do ls > /dev/null ; doneEnabling the function graph tracer and looking at what was happening
I finally noticed that all cashes happened just after an NMI.1) | copy_user_handle_tail() {
1) | bad_area_nosemaphore() {
1) | __bad_area_nosemaphore() {
1) | no_context() {
1) | fixup_exception() {
1) 0.319 us | search_exception_tables();
1) 0.873 us | }
[...]
1) 0.314 us | __rcu_read_unlock();
1) 0.325 us | native_apic_mem_write();
1) 0.943 us | }
1) 0.304 us | rcu_nmi_exit();
[...]
1) 0.479 us | find_vma();
1) | bad_area() {
1) | __bad_area() {After capturing several traces of failures, all of them happened
after an NMI. Curious about this, I added a trace_printk() to the NMI
handler to read the regs->ip to see where the NMI happened. In which I
found out it was here:ffffffff8135b660 :
ffffffff8135b660: 48 83 ec 78 sub $0x78,%rsp
ffffffff8135b664: e8 97 01 00 00 callq ffffffff8135b800What was happening is that the NMI would happen at the place that a page
fault occurred. It would call rcu_read_lock() which was traced by
the lock events, and the user_stack_trace would run. This would trigger
a page fault inside the NMI. I do not see where the CR2 register is
saved or restored in NMI handling. This means that it would corrupt
the page fault handling that the NMI interrupted.The reason the while loop of ls helped trigger the bug, was that
each execution of ls would cause lots of pages to be faulted in, and
increase the chances of the race happening.The simple solution is to not allow user stack traces in NMI context.
After this patch, I ran the above "ls" test for a couple of hours
without any issues. Without this patch, the bug would trigger in less
than a minute.Cc: stable@kernel.org
Reported-by: Li Zefan
Signed-off-by: Steven Rostedt -
When the trace iterator is read, tracing_start() and tracing_stop()
is called to stop tracing while the iterator is processing the trace
output.These functions disable both the standard buffer and the max latency
buffer. But if the wakeup tracer is running, it can switch these
buffers between the two disables:buffer = global_trace.buffer;
if (buffer)
ring_buffer_record_disable(buffer);<<
Signed-off-by: Steven Rostedt -
In the ftrace code that resets the ring buffer it references the
buffer with a local variable, but then uses the tr->buffer as the
parameter to reset. If the wakeup tracer is running, which can
switch the tr->buffer with the max saved buffer, this can break
the requirement of disabling the buffer before the reset.buffer = tr->buffer;
ring_buffer_record_disable(buffer);
synchronize_sched();
__tracing_reset(tr->buffer, cpu);If the tr->buffer is swapped, then the reset is not happening to the
buffer that was disabled. This will cause the ring buffer to fail.Found with Li Zefan's ftrace_stress_test.
Cc: stable@kernel.org
Reported-by: Lai Jiangshan
Signed-off-by: Steven Rostedt -
If the graph tracer is active, and a task is forked but the allocating of
the processes graph stack fails, it can cause crash later on.This is due to the temporary stack being NULL, but the curr_ret_stack
variable is copied from the parent. If it is not -1, then in
ftrace_graph_probe_sched_switch() the following:for (index = next->curr_ret_stack; index >= 0; index--)
next->ret_stack[index].calltime += timestamp;Will cause a kernel OOPS.
Found with Li Zefan's ftrace_stress_test.
Cc: stable@kernel.org
Signed-off-by: Steven Rostedt -
The ring buffer resizing and resetting relies on a schedule RCU
action. The buffers are disabled, a synchronize_sched() is called
and then the resize or reset takes place.But this only works if the disabling of the buffers are within the
preempt disabled section, otherwise a window exists that the buffers
can be written to while a reset or resize takes place.Cc: stable@kernel.org
Reported-by: Li Zefan
Signed-off-by: Lai Jiangshan
LKML-Reference:
Signed-off-by: Steven Rostedt