14 Dec, 2017
1 commit
-
commit c07d35338081d107e57cf37572d8cc931a8e32e2 upstream.
kallsyms_symbol_next() returns a boolean (true on success). Currently
kdb_read() tests the return value with an inequality that
unconditionally evaluates to true.

This is fixed in the obvious way and, since the conditional branch is
supposed to be unreachable, we also add a WARN_ON().

Reported-by: Dan Carpenter
Signed-off-by: Daniel Thompson
Signed-off-by: Jason Wessel
Signed-off-by: Greg Kroah-Hartman
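
The class of bug described in the commit above can be illustrated with a minimal userspace sketch. All names here (symbol_next, check_with_inequality, check_failure) are hypothetical stand-ins and not the kdb code; the sketch only shows why an inequality against zero cannot distinguish the two outcomes of a boolean-returning function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a lookup that returns true on success. */
static bool symbol_next(const char *const *table, size_t count, size_t index)
{
	assert(table != NULL);
	return index < count;
}

/* Buggy shape: a 0/1 result is always >= 0, so this test is constant. */
static bool check_with_inequality(bool ok)
{
	return (int)ok >= 0;
}

/* Fixed shape: test the boolean directly to detect failure. */
static bool check_failure(bool ok)
{
	return !ok;
}
```

Only the second form changes its answer when the lookup fails, which is what a failure branch needs.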
02 Mar, 2017
6 commits
-
We are going to split out of , which
will have to be picked up from other headers and a couple of .c files.

Create a trivial placeholder file that just
maps to to make this patch obviously correct and
bisectable.

Include the new header in the files that are going to need it.

Acked-by: Linus Torvalds
Cc: Mike Galbraith
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar -
We are going to split out of , which
will have to be picked up from other headers and a couple of .c files.

Create a trivial placeholder file that just
maps to to make this patch obviously correct and
bisectable.

Include the new header in the files that are going to need it.

Acked-by: Linus Torvalds
Cc: Mike Galbraith
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar -
We are going to move softlockup APIs out of , which
will have to be picked up from other headers and a couple of .c files.

already includes .
Include the header in the files that are going to need it.
Acked-by: Linus Torvalds
Cc: Mike Galbraith
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar -
We are going to split out of , which
will have to be picked up from other headers and a couple of .c files.

Create a trivial placeholder file that just
maps to to make this patch obviously correct and
bisectable.

Include the new header in the files that are going to need it.

Acked-by: Linus Torvalds
Cc: Mike Galbraith
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar -
We are going to split out of , which
will have to be picked up from a couple of .c files.

Create a trivial placeholder file that just
maps to to make this patch obviously correct and
bisectable.

Include the new header in the files that are going to need it.

Acked-by: Linus Torvalds
Cc: Mike Galbraith
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar -
…sched.h> to <linux/mm_types>
The <linux/sched.h> header includes various vmacache related defines,
which are arguably misplaced.

Move them to mm_types.h and minimize the sched.h impact by putting
all task vmacache state into a new 'struct vmacache' structure.

No change in functionality.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
15 Dec, 2016
4 commits
-
kdb_trap_printk allows passing normal printk() messages to kdb via
vkdb_printf(). For example, it is used to get a backtrace using the
classic show_stack(), see kdb_show_stack().

vkdb_printf() tries to avoid a potential infinite loop by disabling the
trap. But this approach is racy, for example:

CPU1                                    CPU2

vkdb_printf()
  // assume that kdb_trap_printk == 0
  saved_trap_printk = kdb_trap_printk;
  kdb_trap_printk = 0;

                                        kdb_show_stack()
                                          kdb_trap_printk++;

Problem1: Now, a nested printk() on CPU1 calls vkdb_printf()
          even when it should have been disabled. It will not
          cause a deadlock but...

  // using the outdated saved value: 0
  kdb_trap_printk = saved_trap_printk;

                                          kdb_trap_printk--;

Problem2: Now, kdb_trap_printk == -1 and will stay like this.
          It means that all messages will get passed to kdb from
          now on.

This patch removes the racy saved_trap_printk handling. Instead, the
recursion is prevented by a check for the locked CPU.

The solution is still kind of racy. A non-related printk(), from
another process, might get trapped by vkdb_printf(). And the wanted
printk() might not get trapped because kdb_printf_cpu is assigned. But
this problem existed even with the original code.

A proper solution would be to get_cpu() before setting kdb_trap_printk
and trap messages only from this CPU. I am not sure if it is worth the
effort, though.

In fact, the race is very theoretical. When kdb is running any of the
commands that use kdb_trap_printk there is a single active CPU and the
other CPUs should be in a holding pen inside kgdb_cpu_enter().

The only time this is violated is when there is a timeout waiting for
the other CPUs to report to the holding pen.

Finally, note that the situation is a bit schizophrenic. vkdb_printf()
explicitly allows recursion but only from KDB code that calls
kdb_printf() directly. On the other hand, the generic printk()
recursion is not allowed because it might cause an infinite loop. This
is why we could not hide the decision inside vkdb_printf() easily.

Link: http://lkml.kernel.org/r/1480412276-16690-4-git-send-email-pmladek@suse.com
Signed-off-by: Petr Mladek
Cc: Daniel Thompson
Cc: Jason Wessel
Cc: Peter Zijlstra
Cc: Sergey Senozhatsky
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
kdb_printf_lock does not prevent other CPUs from entering the critical
section because it is ignored when KDB_STATE_PRINTF_LOCK is set.

The problematic situation might look like:

CPU0                                    CPU1

vkdb_printf()
  if (!KDB_STATE(PRINTF_LOCK))
    KDB_STATE_SET(PRINTF_LOCK);
    spin_lock_irqsave(&kdb_printf_lock, flags);

                                        vkdb_printf()
                                          if (!KDB_STATE(PRINTF_LOCK))

BANG: The PRINTF_LOCK state is set and CPU1 is entering the critical
section without spinning on the lock.

The problem is that the code tries to implement locking using two state
variables that are not handled atomically. Well, we need custom
locking because we want to allow reentering the critical section on the
very same CPU.

Let's use the solution from Peter Zijlstra that was proposed for a similar
scenario, see
https://lkml.kernel.org/r/20161018171513.734367391@infradead.org

This patch uses the same trick with cmpxchg(). The only difference is
that we want to handle only recursion from the same context and
therefore we disable interrupts.

In addition, KDB_STATE_PRINTF_LOCK is removed. In fact, we are not able
to set it in a non-racy way.

Link: http://lkml.kernel.org/r/1480412276-16690-3-git-send-email-pmladek@suse.com
Signed-off-by: Petr Mladek
Reviewed-by: Daniel Thompson
Cc: Jason Wessel
Cc: Peter Zijlstra
Cc: Sergey Senozhatsky
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
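
The cmpxchg()-based idea in the commit above (one atomic owner variable instead of two separately updated state variables, with re-entry allowed on the owning CPU) can be sketched in userspace with C11 atomics. This is only an illustration under assumptions: printf_owner, owner_lock_try(), owner_lock() and owner_unlock() are hypothetical names, the CPU id is passed in by the caller, and the interrupt disabling mentioned in the message is not modelled.

```c
#include <assert.h>
#include <stdatomic.h>

#define NO_OWNER (-1)

/* Single atomic state variable: the CPU that owns the lock, or NO_OWNER. */
static atomic_int printf_owner = NO_OWNER;

/* One compare-and-exchange attempt; returns the owner seen before it. */
static int owner_lock_try(int cpu)
{
	int expected = NO_OWNER;

	assert(cpu >= 0);
	if (atomic_compare_exchange_strong(&printf_owner, &expected, cpu))
		return NO_OWNER;	/* lock was free, now ours */
	return expected;		/* somebody (maybe us) already owns it */
}

/* Spin until the lock is ours; re-entry on the owning CPU succeeds. */
static int owner_lock(int cpu)
{
	for (;;) {
		int old = owner_lock_try(cpu);

		if (old == NO_OWNER || old == cpu)
			return old;
	}
}

/* Restore the owner seen at lock time; only the outermost call frees it. */
static void owner_unlock(int old)
{
	atomic_store(&printf_owner, old);
}
```

Because the unlock writes back the value observed at lock time, a nested call on the same CPU leaves ownership in place and only the outermost caller releases the lock.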
kdb_event state variable is only set but never checked in the kernel
code.

http://www.spinics.net/lists/kdb/msg01733.html suggests that this
variable affected WARN_CONSOLE_UNLOCKED() in the original
implementation. But this check never went upstream.

The semantic is unclear and racy. The value is updated after the
kdb_printf_lock is acquired and after it is released. It should be
symmetric at minimum. The value should be manipulated either inside or
outside the locked area.

Fortunately, it seems that the original function is gone and we could
simply remove the state variable.

Link: http://lkml.kernel.org/r/1480412276-16690-2-git-send-email-pmladek@suse.com
Signed-off-by: Petr Mladek
Suggested-by: Daniel Thompson
Cc: Jason Wessel
Cc: Peter Zijlstra
Cc: Sergey Senozhatsky
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
We've got a delay loop waiting for secondary CPUs. That loop uses
loops_per_jiffy. However, loops_per_jiffy doesn't actually mean how
many tight loops make up a jiffy on all architectures. It is quite
common to see things like this in the boot log:

  Calibrating delay loop (skipped), value calculated using timer
  frequency.. 48.00 BogoMIPS (lpj=24000)

In my case I was seeing lots of cases where other CPUs timed out
entering the debugger only to print their stack crawls shortly after the
kdb> prompt was written.

Elsewhere in kgdb we already use udelay(), so that should be safe enough
to use to implement our timeout. We'll delay 1 ms for 1000 times, which
should give us a full second of delay (just like the old code wanted)
but allow us to notice that we're done every 1 ms.

[akpm@linux-foundation.org: simplifications, per Daniel]
Link: http://lkml.kernel.org/r/1477091361-2039-1-git-send-email-dianders@chromium.org
Signed-off-by: Douglas Anderson
Reviewed-by: Daniel Thompson
Cc: Jason Wessel
Cc: Brian Norris
Cc: [4.0+]
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
13 Dec, 2016
1 commit
-
Commit 4bcc595ccd80 ("printk: reinstate KERN_CONT for printing
continuation lines") allows defining more message headers for a single
message. The motivation is that continuous lines might get mixed.
Therefore it makes sense to define the right log level for every piece of
a cont line.

This patch introduces printk_skip_headers() that will skip all headers
and uses it in the kdb code instead of printk_skip_level().

This approach helps to fix other printk_skip_level() users
independently.

Link: http://lkml.kernel.org/r/1478695291-12169-3-git-send-email-pmladek@suse.com
Signed-off-by: Petr Mladek
Cc: Joe Perches
Cc: Sergey Senozhatsky
Cc: Steven Rostedt
Cc: Jason Wessel
Cc: Jaroslav Kysela
Cc: Takashi Iwai
Cc: Chris Mason
Cc: Josef Bacik
Cc: David Sterba
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
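
The difference between skipping one header and skipping all of them can be sketched in userspace. The helper names below are hypothetical, and the header format (an SOH byte followed by '0'..'7', 'd' or 'c') is an assumption about the two-byte printk prefix rather than something stated in the message above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SOH '\001'

/* Returns the level character if buf starts with a header, else 0. */
static int header_level(const char *buf)
{
	assert(buf != NULL);
	if (buf[0] == SOH && buf[1] != '\0') {
		char c = buf[1];

		/* Assumed set of valid level characters. */
		if ((c >= '0' && c <= '7') || c == 'd' || c == 'c')
			return c;
	}
	return 0;
}

/* Skips at most one two-byte header. */
static const char *skip_one_header(const char *buf)
{
	return header_level(buf) ? buf + 2 : buf;
}

/* Skips every leading header, however many were stacked. */
static const char *skip_all_headers(const char *buf)
{
	while (header_level(buf))
		buf += 2;
	return buf;
}
```

With a message that carries both a level and a continuation marker, the single-skip helper leaves the second header in the text while the loop removes both.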
22 Feb, 2016
1 commit
-
It may be useful to debug writes to the readonly sections of memory,
so provide a cmdline "rodata=off" to allow for this. This can be
expanded in the future to support "log" and "write" modes, but that
will need to be architecture-specific.

This also makes KDB software breakpoints more usable, as read-only
mappings can now be disabled on any kernel.

Suggested-by: H. Peter Anvin
Signed-off-by: Kees Cook
Cc: Andy Lutomirski
Cc: Arnd Bergmann
Cc: Borislav Petkov
Cc: Brian Gerst
Cc: David Brown
Cc: Denys Vlasenko
Cc: Emese Revfy
Cc: Linus Torvalds
Cc: Mathias Krause
Cc: Michael Ellerman
Cc: PaX Team
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: kernel-hardening@lists.openwall.com
Cc: linux-arch
Link: http://lkml.kernel.org/r/1455748879-21872-3-git-send-email-keescook@chromium.org
Signed-off-by: Ingo Molnar
05 Dec, 2015
1 commit
-
Makes it easier to handle init vs core cleanly, though the change is
fairly invasive across random architectures.

It simplifies the rbtree code immediately, however, while keeping the
core data together in the same cacheline (now iff the rbtree code is
enabled).

Acked-by: Peter Zijlstra
Reviewed-by: Josh Poimboeuf
Signed-off-by: Rusty Russell
Signed-off-by: Jiri Kosina
20 Feb, 2015
8 commits
-
On non-developer devices, kgdb prevents the device from rebooting
after a panic.

In case of panics and exceptions, to allow the device to reboot, prevent
entering debug mode to avoid getting stuck waiting for the user to
interact with the debugger.

To avoid entering the debugger on panic/exception without any extra
configuration, panic_timeout is being used, which can be set via
/proc/sys/kernel/panic at run time and CONFIG_PANIC_TIMEOUT sets the
default value.

Setting panic_timeout indicates that the user requested the machine to
perform an unattended reboot after panic. We don't want to get stuck waiting
for the user input in case of panic.

Cc: Andrew Morton
Cc: kgdb-bugreport@lists.sourceforge.net
Cc: linux-kernel@vger.kernel.org
Cc: Android Kernel Team
Cc: John Stultz
Cc: Sumit Semwal
Signed-off-by: Colin Cross
[Kiran: Added context to commit message.
panic_timeout is used instead of break_on_panic and
break_on_exception to honor CONFIG_PANIC_TIMEOUT
Modified the commit as per community feedback]
Signed-off-by: Kiran Raparthy
Signed-off-by: Daniel Thompson
Signed-off-by: Jason Wessel -
All current callers of kdb_getstr() can pass constant pointers via the
prompt argument. This patch adds a const qualification to make explicit
the fact that this is safe.

Signed-off-by: Daniel Thompson
Signed-off-by: Jason Wessel -
Currently kdb allows the output of commands to be filtered using the
| grep feature. This is useful but does not permit the output emitted
shortly after a string match to be examined without wading through the
entire unfiltered output of the command. Such a feature is particularly
useful to navigate function traces because these traces often have a
useful trigger string *before* the point of interest.

This patch reuses the existing filtering logic to introduce a simple
forward search to kdb that can be triggered from the more prompt.

Signed-off-by: Daniel Thompson
Signed-off-by: Jason Wessel -
Currently when the "| grep" feature is used to filter the output of a
command then the prompt is not displayed for the subsequent command.
Likewise any characters typed by the user are also not echoed to the
display. This rather disconcerting problem eventually corrects itself
when the user presses Enter and the kdb_grepping_flag is cleared as
kdb_parse() tries to make sense of whatever they typed.

This patch resolves the problem by moving the clearing of this flag
from the middle of command processing to the beginning.

Signed-off-by: Daniel Thompson
Signed-off-by: Jason Wessel -
Issuing a stack dump feels ergonomically wrong when entering due to NMI.
Entering due to NMI is normally a reaction to a user request, either the
NMI button on a server or a "magic knock" on a UART. Therefore the
backtrace behaviour on entry due to NMI should be like SysRq-g (no stack
dump) rather than like oops.

Note also that the stack dump does not offer any information that
cannot be trivially retrieved using the 'bt' command.

Signed-off-by: Daniel Thompson
Signed-off-by: Jason Wessel -
Currently when kdb traps printk messages then the raw log level prefix
(consisting of '\001' followed by a numeral) does not get stripped off
before the message is issued to the various I/O handlers supported by
kdb. This causes annoying visual noise as well as causing problems
grepping for ^. It is also a change of behaviour compared to normal usage
of printk(). For example -h ends up with different output to
that of kdb's "sr h".

This patch addresses the problem by stripping log levels from messages
before they are issued to the I/O handlers. printk(), which can also
act as an I/O handler in some cases, is special cased; if the caller
provided a log level then the prefix will be preserved when sent to
printk().

The addition of non-printable characters to the output of kdb commands is a
regression, albeit an extremely elderly one, introduced by commit
04d2c8c83d0e ("printk: convert the format for KERN_ to a 2 byte
pattern"). Note also that this patch does *not* restore the original
behaviour from v3.5. Instead it makes printk() from within a kdb command
display the message without any prefix (i.e. like printk() normally does).

Signed-off-by: Daniel Thompson
Cc: Joe Perches
Cc: stable@vger.kernel.org
Signed-off-by: Jason Wessel -
There was a follow on replacement patch against the prior
"kgdb: Timeout if secondary CPUs ignore the roundup".

See: https://lkml.org/lkml/2015/1/7/442

This patch is the delta vs the patch that was committed upstream:
* Fix an off-by-one error in kdb_cpu().
* Replace NR_CPUS with CONFIG_NR_CPUS to tell checkpatch that we
  really want a static limit.
* Removed the "KGDB: " prefix from the pr_crit() in debug_core.c
  (kgdb-next contains a patch which introduced pr_fmt() to this file
  so the tag will now be applied automatically).

Cc: Daniel Thompson
Cc:
Signed-off-by: Jason Wessel -
The output of the KDB 'summary' command should report MemTotal, MemFree
and Buffers in kB. The current code reports them in units of pages.

A define of K(x) as
is defined in the code, but not used.

This patch applies the define to convert the values to kB.

Please include me on Cc on replies. I do not subscribe to linux-kernel.

Signed-off-by: Jay Lan
Cc:
Signed-off-by: Jason Wessel
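
The unit conversion the commit above is about (page counts to kB) can be illustrated with a small helper. This is a sketch under assumptions: pages_to_kb() is a hypothetical function, not the K(x) define the message refers to, and the page size is passed in as a shift (12 for 4 KiB pages) instead of coming from the kernel configuration.

```c
#include <assert.h>

/*
 * Convert a page count to kB. A page is 2^page_shift bytes and a kB is
 * 2^10 bytes, so each page contributes 2^(page_shift - 10) kB.
 */
static unsigned long pages_to_kb(unsigned long pages, unsigned int page_shift)
{
	assert(page_shift >= 10);
	return pages << (page_shift - 10);
}
```

For example, with 4 KiB pages 256 pages correspond to 1024 kB, so reporting the raw page count would understate memory by a factor of four.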
23 Jan, 2015
1 commit
-
Pull module and param fixes from Rusty Russell:
"Surprising number of fixes this merge window :(

The first two are minor fallout from the param rework which went in
this merge window.

The next three are a series which fixes a longstanding (but never
previously reported and unlikely, so no CC stable) race between
kallsyms and freeing the init section.

Finally, a minor cleanup as our module refcount will now be -1 during
unload"

* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
module: make module_refcount() a signed integer.
module: fix race in kallsyms resolution during module load success.
module: remove mod arg from module_free, rename module_memfree().
module_arch_freeing_init(): new hook for archs before module->module_init freed.
param: fix uninitialized read with CONFIG_DEBUG_LOCK_ALLOC
param: initialize store function to NULL if not available.
22 Jan, 2015
1 commit
-
James Bottomley points out that it will be -1 during unload. It's
only used for diagnostics, so let's not hide that as it could be a
clue as to what's gone wrong.

Cc: Jason Wessel
Acked-and-documention-added-by: James Bottomley
Reviewed-by: Masami Hiramatsu
Signed-off-by: Rusty Russell
11 Nov, 2014
10 commits
-
- Convert printk( to pr_foo()
- Add pr_fmt
- Coalesce formats

Cc: Jason Wessel
Cc: Andrew Morton
Cc: Joe Perches
Signed-off-by: Fabian Frederick
Signed-off-by: Jason Wessel -
Currently if an active CPU fails to respond to a roundup request the CPU
that requested the roundup will become stuck. This needlessly reduces the
robustness of the debugger.

This patch introduces a timeout allowing the system state to be examined
even when the system contains unresponsive processors. It also modifies
kdb's cpu command to make it censor attempts to switch to unresponsive
processors and to report their state as (D)ead.

Signed-off-by: Daniel Thompson
Cc: Jason Wessel
Signed-off-by: Andrew Morton
Signed-off-by: Jason Wessel -
Currently kiosk mode must be explicitly requested by the bootloader or
userspace. It is convenient to be able to change the default value in a
similar manner to CONFIG_MAGIC_SYSRQ_DEFAULT_MASK.

Signed-off-by: Daniel Thompson
Cc: Jason Wessel
Signed-off-by: Jason Wessel -
Currently all kdb commands are enabled whenever kdb is deployed. This
makes it difficult to deploy kdb to help debug certain types of
systems.

Android phones provide one example; the FIQ debugger found on some
Android devices has a deliberately weak set of commands to allow the
debugger to be enabled very late in the production cycle.

Certain kiosk environments offer another interesting case where an
engineer might wish to probe the system state using passive inspection
commands without providing sufficient power for a passer by to root it.

Without any restrictions, obtaining the root rights via KDB is a matter of
a few commands, and works everywhere. For example, log in as a normal
user:

  cbou:~$ id
  uid=1001(cbou) gid=1001(cbou) groups=1001(cbou)

Now enter KDB (for example via sysrq):

  Entering kdb (current=0xffff8800065bc740, pid 920) due to Keyboard Entry
  kdb> ps
  23 sleeping system daemon (state M) processes suppressed,
  use 'ps A' to see all.
  Task Addr          Pid Parent [*] cpu State Thread             Command
  0xffff8800065bc740 920 919    1   0   R     0xffff8800065bca20 *bash

  0xffff880007078000 1   0      0   0   S     0xffff8800070782e0 init
  [...snip...]
  0xffff8800065be3c0 918 1      0   0   S     0xffff8800065be6a0 getty
  0xffff8800065b9c80 919 1      0   0   S     0xffff8800065b9f60 login
  0xffff8800065bc740 920 919    1   0   R     0xffff8800065bca20 *bash

All we need is the offset of cred pointers. We can look up the offset in
the distro's kernel source, but it is unnecessary. We can just start
dumping init's task_struct, until we see the process name:

  kdb> md 0xffff880007078000
  0xffff880007078000 0000000000000001 ffff88000703c000 ................
  0xffff880007078010 0040210000000002 0000000000000000 .....!@.........
  [...snip...]
  0xffff8800070782b0 ffff8800073e0580 ffff8800073e0580 ..>.......>.....
  0xffff8800070782c0 0000000074696e69 0000000000000000 init............

^ Here, 'init'. Creds are just above it, so the offset is 0x02b0.

Now we set up init's creds for our non-privileged shell:

  kdb> mm 0xffff8800065bc740+0x02b0 0xffff8800073e0580
  0xffff8800065bc9f0 = 0xffff8800073e0580
  kdb> mm 0xffff8800065bc740+0x02b8 0xffff8800073e0580
  0xffff8800065bc9f8 = 0xffff8800073e0580

And thus gaining the root:

  kdb> go
  cbou:~$ id
  uid=0(root) gid=0(root) groups=0(root)
  cbou:~$ bash
  root:~#

p.s. No distro enables kdb by default (although, with a nice KDB-over-KMS
feature availability, I would expect at least some would enable it), so
it's not actually some kind of a major issue.

Signed-off-by: Anton Vorontsov
Signed-off-by: John Stultz
Signed-off-by: Daniel Thompson
Cc: Jason Wessel
Signed-off-by: Jason Wessel -
This patch introduces several new flags to collect kdb commands into
groups (later allowing them to be optionally disabled).

This follows similar prior art to enable/disable magic sysrq
commands.

The commands have been categorized as follows:

Always on: go (w/o args), env, set, help, ?, cpu (w/o args), sr,
dmesg, disable_nmi, defcmd, summary, grephelp
Mem read: md, mdr, mdp, mds, ef, bt (with args), per_cpu
Mem write: mm
Reg read: rd
Reg write: go (with args), rm
Inspect: bt (w/o args), btp, bta, btc, btt, ps, pid, lsmod
Flow ctrl: bp, bl, bph, bc, be, bd, ss
Signal: kill
Reboot: reboot
All: cpu, kgdb, (and all of the above), nmi_console

Signed-off-by: Daniel Thompson
Cc: Jason Wessel
Signed-off-by: Jason Wessel -
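
One way to picture the grouping in the commit above is a bitmask per command plus a mask of enabled groups. The sketch below is only an illustration: the GROUP_* names, struct command and command_allowed() are hypothetical and not the identifiers used by kdb. It keeps separate masks for calls with and without arguments, since the list places "go" without arguments in a different group from "go" with arguments.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical group bits, one per category in the list above. */
enum {
	GROUP_MEM_READ  = 1u << 0,
	GROUP_MEM_WRITE = 1u << 1,
	GROUP_REG_READ  = 1u << 2,
	GROUP_REG_WRITE = 1u << 3,
	GROUP_INSPECT   = 1u << 4,
	GROUP_FLOW_CTRL = 1u << 5,
	GROUP_SIGNAL    = 1u << 6,
	GROUP_REBOOT    = 1u << 7,
	GROUP_ALWAYS    = 1u << 8,	/* usable regardless of the mask */
};

struct command {
	const char *name;
	unsigned int groups_no_args;	/* groups when called without args */
	unsigned int groups_with_args;	/* groups when called with args */
};

/* A command runs if it is always-on or shares a bit with `enabled`. */
static bool command_allowed(const struct command *cmd, unsigned int enabled,
			    int argc)
{
	unsigned int needed;

	assert(cmd != NULL);
	needed = argc > 0 ? cmd->groups_with_args : cmd->groups_no_args;
	if (needed & GROUP_ALWAYS)
		return true;
	return (needed & enabled) != 0;
}
```

A passive-inspection profile would then enable only the read and inspect bits, leaving memory and register writes unavailable.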
Since we now treat KDB_REPEAT_* as flags, there is no need to
pass KDB_REPEAT_NONE. It's just the default behaviour when no
flags are specified.

Signed-off-by: Anton Vorontsov
Signed-off-by: John Stultz
Signed-off-by: Daniel Thompson
Cc: Jason Wessel
Signed-off-by: Jason Wessel -
The actual values of KDB_REPEAT_* enum values and overall logic stayed
the same, but we now treat the values as flags.

This makes it possible to add other flags and combine them, plus makes
the code a lot simpler and shorter. But functionality-wise, there should
be no changes.

Signed-off-by: Anton Vorontsov
Signed-off-by: John Stultz
Signed-off-by: Daniel Thompson
Cc: Jason Wessel
Signed-off-by: Jason Wessel -
We're about to add more options for command behaviour, so let's give
a more generic name to the low-level kdb command registration function.

There are just various renames, no functional changes.

Signed-off-by: Anton Vorontsov
Signed-off-by: John Stultz
Signed-off-by: Daniel Thompson
Cc: Jason Wessel
Signed-off-by: Jason Wessel -
We're about to add more options for command behaviour, so let's expand
the meaning of kdb_repeat_t.

So far we just do various renames, there should be no functional changes.

Signed-off-by: Anton Vorontsov
Signed-off-by: John Stultz
Signed-off-by: Daniel Thompson
Cc: Jason Wessel
Signed-off-by: Jason Wessel -
The struct member is never used in the code, so we can remove it.
We will introduce real flags soon by renaming cmd_repeat to cmd_flags.
Signed-off-by: Anton Vorontsov
Signed-off-by: John Stultz
Signed-off-by: Daniel Thompson
Cc: Jason Wessel
Signed-off-by: Jason Wessel
14 Oct, 2014
1 commit
-
The kernel used to contain two functions for length-delimited,
case-insensitive string comparison, strnicmp with correct semantics and
a slightly buggy strncasecmp. The latter is the POSIX name, so strnicmp
was renamed to strncasecmp, and strnicmp made into a wrapper for the new
strncasecmp to avoid breaking existing users.

To allow the compat wrapper strnicmp to be removed at some point in the
future, and to avoid the extra indirection cost, do
s/strnicmp/strncasecmp/g.

Signed-off-by: Rasmus Villemoes
Cc: Jason Wessel
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
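
For readers unfamiliar with the POSIX name, the sketch below shows length-delimited, case-insensitive comparison using the userspace strncasecmp() from <strings.h>. The wrapper has_prefix_nocase() is a hypothetical helper for illustration and does not appear in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <strings.h>

/* True if the first n bytes of s and prefix match, ignoring case. */
static bool has_prefix_nocase(const char *s, const char *prefix, size_t n)
{
	assert(s != NULL && prefix != NULL);
	return strncasecmp(s, prefix, n) == 0;
}
```

Calling the POSIX-named function directly, as the rename does, avoids one level of wrapper indirection per comparison.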
12 Jun, 2014
1 commit
-
do_posix_clock_monotonic_gettime() is a leftover from the initial
posix timer implementation which maps to ktime_get_ts().

Signed-off-by: Thomas Gleixner
Cc: John Stultz
Cc: Peter Zijlstra
Cc: Jason Wessel
Link: http://lkml.kernel.org/r/20140611234607.261629142@linutronix.de
Signed-off-by: Thomas Gleixner
05 Jun, 2014
1 commit
-
... instead of naked numbers.

Stuff in sysrq.c used to set it to 8 which is supposed to mean above
default level so set it to DEBUG instead as we're terminating/killing all
tasks and we want to be verbose there.

Also, correct the check in x86_64_start_kernel which should be >= as
we're clearly issuing the string there for all debug levels, not only
the magical 10.

Signed-off-by: Borislav Petkov
Acked-by: Kees Cook
Acked-by: Randy Dunlap
Cc: Joe Perches
Cc: Valdis Kletnieks
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
18 Apr, 2014
1 commit
-
Mostly scripted conversion of the smp_mb__* barriers.
Signed-off-by: Peter Zijlstra
Acked-by: Paul E. McKenney
Link: http://lkml.kernel.org/n/tip-55dhyhocezdw1dg7u19hmh1u@git.kernel.org
Cc: Linus Torvalds
Cc: linux-arch@vger.kernel.org
Signed-off-by: Ingo Molnar
08 Apr, 2014
1 commit
-
This patch is a continuation of efforts trying to optimize find_vma(),
avoiding potentially expensive rbtree walks to locate a vma upon faults.
The original approach (https://lkml.org/lkml/2013/11/1/410), where the
largest vma was also cached, ended up being too specific and random,
thus further comparison with other approaches was needed. There are
two things to consider when dealing with this, the cache hit rate and
the latency of find_vma(). Improving the hit-rate does not necessarily
translate into finding the vma any faster, as the overhead of any fancy
caching schemes can be too high to consider.

We currently cache the last used vma for the whole address space, which
provides a nice optimization, reducing the total cycles in find_vma() by
up to 250%, for workloads with good locality. On the other hand, this
simple scheme is pretty much useless for workloads with poor locality.
Analyzing ebizzy runs shows that, no matter how many threads are
running, the mmap_cache hit rate is less than 2%, and in many situations
below 1%.

The proposed approach is to replace this scheme with a small per-thread
cache, maximizing hit rates at a very low maintenance cost.
Invalidations are performed by simply bumping up a 32-bit sequence
number. The only expensive operation is in the rare case of a seq
number overflow, where all caches that share the same address space are
flushed. Upon a miss, the proposed replacement policy is based on the
page number that contains the virtual address in question. Concretely,
the following results are seen on an 80 core, 8 socket x86-64 box:

1) System bootup: Most programs are single threaded, so the per-thread
scheme does improve ~50% hit rate by just adding a few more slots to
the cache.

+----------------+----------+------------------+
| caching scheme | hit-rate | cycles (billion) |
+----------------+----------+------------------+
| baseline       | 50.61%   | 19.90            |
| patched        | 73.45%   | 13.58            |
+----------------+----------+------------------+

2) Kernel build: This one is already pretty good with the current
approach as we're dealing with good locality.

+----------------+----------+------------------+
| caching scheme | hit-rate | cycles (billion) |
+----------------+----------+------------------+
| baseline       | 75.28%   | 11.03            |
| patched        | 88.09%   | 9.31             |
+----------------+----------+------------------+

3) Oracle 11g Data Mining (4k pages): Similar to the kernel build workload.

+----------------+----------+------------------+
| caching scheme | hit-rate | cycles (billion) |
+----------------+----------+------------------+
| baseline       | 70.66%   | 17.14            |
| patched        | 91.15%   | 12.57            |
+----------------+----------+------------------+

4) Ebizzy: There's a fair amount of variation from run to run, but this
approach always shows nearly perfect hit rates, while baseline is just
about non-existent. The amounts of cycles can fluctuate between
anywhere from ~60 to ~116 for the baseline scheme, but this approach
reduces it considerably. For instance, with 80 threads:

+----------------+----------+------------------+
| caching scheme | hit-rate | cycles (billion) |
+----------------+----------+------------------+
| baseline       | 1.06%    | 91.54            |
| patched        | 99.97%   | 14.18            |
+----------------+----------+------------------+

[akpm@linux-foundation.org: fix nommu build, per Davidlohr]
[akpm@linux-foundation.org: document vmacache_valid() logic]
[akpm@linux-foundation.org: attempt to untangle header files]
[akpm@linux-foundation.org: add vmacache_find() BUG_ON]
[hughd@google.com: add vmacache_valid_mm() (from Oleg)]
[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: adjust and enhance comments]
Signed-off-by: Davidlohr Bueso
Reviewed-by: Rik van Riel
Acked-by: Linus Torvalds
Reviewed-by: Michel Lespinasse
Cc: Oleg Nesterov
Tested-by: Hugh Dickins
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
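
The scheme described in the commit above (a small per-thread cache, invalidated by bumping a sequence number, with the replacement slot chosen from the page number of the address) can be sketched in userspace as follows. Everything here is illustrative: the type and function names are hypothetical, the slot count of 4 and the 4 KiB page size are assumptions, and the sequence-number overflow flush mentioned in the message is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CACHE_SLOTS        4u	/* assumed size; must be a power of two */
#define PAGE_SHIFT_ASSUMED 12	/* assumed 4 KiB pages */

struct region {			/* stand-in for a vma: [start, end) */
	uintptr_t start, end;
};

struct address_space {		/* shared by all threads of a process */
	uint32_t seqnum;
};

struct thread_cache {		/* private to one thread */
	uint32_t seqnum;
	const struct region *slots[CACHE_SLOTS];
};

/* Replacement policy: the slot is picked from the page number. */
static size_t slot_for(uintptr_t addr)
{
	return (addr >> PAGE_SHIFT_ASSUMED) & (CACHE_SLOTS - 1);
}

/* Invalidate every thread's cache by bumping the shared sequence number. */
static void cache_invalidate(struct address_space *as)
{
	as->seqnum++;
}

static const struct region *cache_find(struct thread_cache *tc,
				       const struct address_space *as,
				       uintptr_t addr)
{
	assert(tc != NULL && as != NULL);
	if (tc->seqnum != as->seqnum) {
		/* Stale cache: flush it and resynchronise. */
		memset(tc->slots, 0, sizeof(tc->slots));
		tc->seqnum = as->seqnum;
		return NULL;
	}
	for (size_t i = 0; i < CACHE_SLOTS; i++) {
		const struct region *r = tc->slots[i];

		if (r != NULL && r->start <= addr && addr < r->end)
			return r;
	}
	return NULL;
}

/* Called after a miss, once the slow lookup has found the region. */
static void cache_update(struct thread_cache *tc, uintptr_t addr,
			 const struct region *r)
{
	tc->slots[slot_for(addr)] = r;
}
```

Invalidation costs a single increment on the shared structure; each thread pays for the flush lazily on its next lookup, which is what keeps the maintenance cost low.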
26 Feb, 2014
1 commit
-
The function kgdb_breakpoint() sets up a break point at
compile time by calling arch_kgdb_breakpoint().
Though this call is surrounded by wmb() barriers,
the compiler can still re-order the break point,
because this scheduling barrier is not a code motion
barrier in gcc.

Making kgdb_breakpoint() noinline solves this problem
of code reordering around the break point instruction and also
avoids the problem of being called as an inline function from
other places.

More details about the discussion on this can be found here:
http://comments.gmane.org/gmane.linux.ports.arm.kernel/269732

Signed-off-by: Vijaya Kumar K
Acked-by: Will Deacon
Acked-by: Jason Wessel
Signed-off-by: Catalin Marinas