Eric Lee / smarc-ti-linux-kernel | Embedian Git Server

14 Dec, 2014

1 commit

8f4385d59 Merge tag 'trace-seq-buf-3.19-v2' of git://git.kernel.org/pub/scm/linux/kernel/g… ... Browse Code »

…it/rostedt/linux-trace

Pull tracing fixlet from Steven Rostedt:
"Remove unnecessary preempt_disable in printk()"

* tag 'trace-seq-buf-3.19-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
printk: Do not disable preemption for accessing printk_func

Linus Torvalds
2014-12-14 06:04:41 +0800

11 Dec, 2014

6 commits

1fb8915b9 printk: Do not disable preemption for accessing printk_func ... Browse Code »

As printk_func will either be the default function, or a per_cpu function
for the current CPU, there's no reason to disable preemption to access
it from printk. That's because if the printk_func is not the default
then the caller had better disabled preemption as they were the one to
change it.

Link: http://lkml.kernel.org/r/CA+55aFz5-_LKW4JHEBoWinN9_ouNcGRWAF2FUA35u46FRN-Kxw@mail.gmail.com

Suggested-by: Linus Torvalds
Signed-off-by: Steven Rostedt

Steven Rostedt (Red Hat)
2014-12-11 22:12:01 +0800
350e4f498 Merge tag 'trace-seq-buf-3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace ... Browse Code »

Pull nmi-safe seq_buf printk update from Steven Rostedt:
"This code is a fork from the trace-3.19 pull as it needed the
trace_seq clean ups from that branch.

This code solves the issue of performing stack dumps from NMI context.
The issue is that printk() is not safe from NMI context as if the NMI
were to trigger when a printk() was being performed, the NMI could
deadlock from the printk() internal locks. This has been seen in
practice.

With lots of review from Petr Mladek, this code went through several
iterations, and we feel that it is now at a point of quality to be
accepted into mainline.

Here's what is contained in this patch set:

- Creates a "seq_buf" generic buffer utility that allows a descriptor
to be passed around where functions can write their own "printk()"
formatted strings into it. The generic version was pulled out of
the trace_seq() code that was made specifically for tracing.

- The seq_buf code was change to model the seq_file code. I have a
patch (not included for 3.19) that converts the seq_file.c code
over to use seq_buf.c like the trace_seq.c code does. This was
done to make sure that seq_buf.c is compatible with seq_file.c. I
may try to get that patch in for 3.20.

- The seq_buf.c file was moved to lib/ to remove it from being
dependent on CONFIG_TRACING.

- The printk() was updated to allow for a per_cpu "override" of the
internal calls. That is, instead of writing to the console, a call
to printk() may do something else. This made it easier to allow
the NMI to change what printk() does in order to call dump_stack()
without needing to update that code as well.

- Finally, the dump_stack from all CPUs via NMI code was converted to
use the seq_buf code. The caller to trigger the NMI code would
wait till all the NMIs finished, and then it would print the
seq_buf data to the console safely from a non NMI context

One added bonus is that this code also makes the NMI dump stack work
on PREEMPT_RT kernels. As printk() includes sleeping locks on
PREEMPT_RT, printk() only writes to console if the console does not
use any rt_mutex converted spin locks. Which a lot do"

* tag 'trace-seq-buf-3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
x86/nmi: Fix use of unallocated cpumask_var_t
printk/percpu: Define printk_func when printk is not defined
x86/nmi: Perform a safe NMI stack trace on all CPUs
printk: Add per_cpu printk func to allow printk to be diverted
seq_buf: Move the seq_buf code to lib/
seq-buf: Make seq_buf_bprintf() conditional on CONFIG_BINARY_PRINTF
tracing: Add seq_buf_get_buf() and seq_buf_commit() helper functions
tracing: Have seq_buf use full buffer
seq_buf: Add seq_buf_can_fit() helper function
tracing: Add paranoid size check in trace_printk_seq()
tracing: Use trace_seq_used() and seq_buf_used() instead of len
tracing: Clean up tracing_fill_pipe_page()
seq_buf: Create seq_buf_used() to find out how much was written
tracing: Add a seq_buf_clear() helper and clear len and readpos in init
tracing: Convert seq_buf fields to be like seq_file fields
tracing: Convert seq_buf_path() to be like seq_path()
tracing: Create seq_buf layer in trace_seq

Linus Torvalds
2014-12-11 12:35:41 +0800
b6da0076b Merge branch 'akpm' (patchbomb from Andrew) ... Browse Code »

Merge first patchbomb from Andrew Morton:
- a few minor cifs fixes
- dma-debug upadtes
- ocfs2
- slab
- about half of MM
- procfs
- kernel/exit.c
- panic.c tweaks
- printk upates
- lib/ updates
- checkpatch updates
- fs/binfmt updates
- the drivers/rtc tree
- nilfs
- kmod fixes
- more kernel/exit.c
- various other misc tweaks and fixes

* emailed patches from Andrew Morton : (190 commits)
exit: pidns: fix/update the comments in zap_pid_ns_processes()
exit: pidns: alloc_pid() leaks pid_namespace if child_reaper is exiting
exit: exit_notify: re-use "dead" list to autoreap current
exit: reparent: call forget_original_parent() under tasklist_lock
exit: reparent: avoid find_new_reaper() if no children
exit: reparent: introduce find_alive_thread()
exit: reparent: introduce find_child_reaper()
exit: reparent: document the ->has_child_subreaper checks
exit: reparent: s/while_each_thread/for_each_thread/ in find_new_reaper()
exit: reparent: fix the cross-namespace PR_SET_CHILD_SUBREAPER reparenting
exit: reparent: fix the dead-parent PR_SET_CHILD_SUBREAPER reparenting
exit: proc: don't try to flush /proc/tgid/task/tgid
exit: release_task: fix the comment about group leader accounting
exit: wait: drop tasklist_lock before psig->c* accounting
exit: wait: don't use zombie->real_parent
exit: wait: cleanup the ptrace_reparented() checks
usermodehelper: kill the kmod_thread_locker logic
usermodehelper: don't use CLONE_VFORK for ____call_usermodehelper()
fs/hfs/catalog.c: fix comparison bug in hfs_cat_keycmp
nilfs2: fix the nilfs_iget() vs. nilfs_new_inode() races
...

Linus Torvalds
2014-12-11 10:34:42 +0800
f099755d4 printk: drop logbuf_cpu volatile qualifier ... Browse Code »

Pranith Kumar posted a patch in which removed the "volatile"
qualifier for the "logbuf_cpu" variable in vprintk_emit().
https://lkml.org/lkml/2014/11/13/894
In his patch, he used ACCESS_ONCE() for all references to
that symbol to provide whatever protection was intended.

There was some discussion that followed, and in the end Steven Rostedt
concluded that not only was "volatile" not needed, neither was it
required to use ACCESS_ONCE(). I offered an elaborate description that
concluded Steven was right, and Pranith asked me to submit an
alternative patch. And this is it.

The basic reason "volatile" is not needed is that "logbuf_cpu" has
static storage duration, and vprintk_emit() is an exported
interface. This means that the value of logbuf_cpu must be read
from memory the first time it is used in a particular call of
vprintk_emit(). The variable's value is read only once in that
function, when it's read it'll be the copy from memory (or cache).

In addition, the value of "logbuf_cpu" is only ever written under
protection of a spinlock. So the value that is read is the "real"
value (and not an out-of-date cached one). If its value is not
UINT_MAX, it is the current CPU's processor id, and it will have
been last written by the running CPU.

Signed-off-by: Alex Elder
Reported-by: Pranith Kumar
Suggested-by: Steven Rostedt
Reviewed-by: Jan Kara
Cc: Petr Mladek
Cc: Luis R. Rodriguez
Cc: Joe Perches
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Alex Elder
2014-12-11 09:41:11 +0800
a39d4a857 printk: add and use LOGLEVEL_<level> defines for KERN_<LEVEL> equivalents ... Browse Code »

Use #defines instead of magic values.

Signed-off-by: Joe Perches
Acked-by: Greg Kroah-Hartman
Cc: Jason Baron
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Joe Perches
2014-12-11 09:41:11 +0800
1dc6244bd printk: remove used-once early_vprintk ... Browse Code »

Eliminate the unlikely possibility of message interleaving for
early_printk/early_vprintk use.

early_vprintk can be done via the %pV extension so remove this
unnecessary function and change early_printk to have the equivalent
vprintk code.

All uses of early_printk already end with a newline so also remove the
unnecessary newline from the early_printk function.

Signed-off-by: Joe Perches
Acked-by: Chris Metcalf
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Joe Perches
2014-12-11 09:41:10 +0800

22 Nov, 2014

1 commit

04b74b27c printk/percpu: Define printk_func when printk is not defined ... Browse Code »

To avoid include hell, the per_cpu variable printk_func was declared
in percpu.h. But it is only defined if printk is defined.

As users of printk may also use the printk_func variable, it needs to
be defined even if CONFIG_PRINTK is not.

Also add a printk.h include in percpu.h just to be safe.

Link: http://lkml.kernel.org/r/20141121183215.01ba539c@canb.auug.org.au

Reported-by: Stephen Rothwell
Signed-off-by: Steven Rostedt

Steven Rostedt (Red Hat)
2014-11-22 00:19:15 +0800

20 Nov, 2014

1 commit

afdc34a3d printk: Add per_cpu printk func to allow printk to be diverted ... Browse Code »

Being able to divert printk to call another function besides the normal
logging is useful for such things like NMI handling. If some functions
are to be called from NMI that does printk() it is possible to lock up
the box if the nmi handler triggers when another printk is happening.

One example of this use is to perform a stack trace on all CPUs via NMI.
But if the NMI is to do the printk() it can cause the system to lock up.
By allowing the printk to be diverted to another function that can safely
record the printk output and then print it when it in a safe context
then NMIs will be safe to call these functions like show_regs().

Link: http://lkml.kernel.org/p/20140619213952.209176403@goodmis.org

Tested-by: Jiri Kosina
Acked-by: Jiri Kosina
Acked-by: Paul E. McKenney
Reviewed-by: Petr Mladek
Signed-off-by: Steven Rostedt

Steven Rostedt (Red Hat)
2014-11-20 11:01:21 +0800

06 Nov, 2014

1 commit

68c4a4f8a pstore: Honor dmesg_restrict sysctl on dmesg dumps ... Browse Code »

When the kernel.dmesg_restrict restriction is in place, only users with
CAP_SYSLOG should be able to access crash dumps (like: attacker is
trying to exploit a bug, watchdog reboots, attacker can happily read
crash dumps and logs).

This puts the restriction on console-* types as well as sensitive
information could have been leaked there.

Other log types are unaffected.

Signed-off-by: Sebastian Schmidt
Acked-by: Kees Cook
Signed-off-by: Tony Luck

Sebastian Schmidt
2014-11-06 01:59:48 +0800

15 Oct, 2014

1 commit

0429fbc0b Merge branch 'for-3.18-consistent-ops' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu ... Browse Code »

Pull percpu consistent-ops changes from Tejun Heo:
"Way back, before the current percpu allocator was implemented, static
and dynamic percpu memory areas were allocated and handled separately
and had their own accessors. The distinction has been gone for many
years now; however, the now duplicate two sets of accessors remained
with the pointer based ones - this_cpu_*() - evolving various other
operations over time. During the process, we also accumulated other
inconsistent operations.

This pull request contains Christoph's patches to clean up the
duplicate accessor situation. __get_cpu_var() uses are replaced with
with this_cpu_ptr() and __this_cpu_ptr() with raw_cpu_ptr().

Unfortunately, the former sometimes is tricky thanks to C being a bit
messy with the distinction between lvalues and pointers, which led to
a rather ugly solution for cpumask_var_t involving the introduction of
this_cpu_cpumask_var_ptr().

This converts most of the uses but not all. Christoph will follow up
with the remaining conversions in this merge window and hopefully
remove the obsolete accessors"

* 'for-3.18-consistent-ops' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (38 commits)
irqchip: Properly fetch the per cpu offset
percpu: Resolve ambiguities in __get_cpu_var/cpumask_var_t -fix
ia64: sn_nodepda cannot be assigned to after this_cpu conversion. Use __this_cpu_write.
percpu: Resolve ambiguities in __get_cpu_var/cpumask_var_t
Revert "powerpc: Replace __get_cpu_var uses"
percpu: Remove __this_cpu_ptr
clocksource: Replace __this_cpu_ptr with raw_cpu_ptr
sparc: Replace __get_cpu_var uses
avr32: Replace __get_cpu_var with __this_cpu_write
blackfin: Replace __get_cpu_var uses
tile: Use this_cpu_ptr() for hardware counters
tile: Replace __get_cpu_var uses
powerpc: Replace __get_cpu_var uses
alpha: Replace __get_cpu_var
ia64: Replace __get_cpu_var uses
s390: cio driver &__get_cpu_var replacements
s390: Replace __get_cpu_var uses
mips: Replace __get_cpu_var uses
MIPS: Replace __get_cpu_var uses in FPU emulator.
arm: Replace __this_cpu_ptr with raw_cpu_ptr
...

Linus Torvalds
2014-10-15 13:48:18 +0800

14 Oct, 2014

2 commits

98e35f589 printk: git rid of [sched_delayed] message for printk_deferred ... Browse Code »

Commit 458df9fd4815 ("printk: remove separate printk_sched buffers and use
printk buf instead") hardcodes printk_deferred() to KERN_WARNING and
inserts the string "[sched_delayed] " before the actual message. However
it doesn't take into account the KERN_* prefix of the message, that now
ends up in the middle of the output:

[sched_delayed] ^a4CE: hpet increased min_delta_ns to 20115 nsec

Fix this by just getting rid of the "[sched_delayed] " scnprintf(). The
prefix is useless since 458df9fd4815 anyway since from that moment
printk_deferred() inserts the message into the kernel printk buffer
immediately. So if the message eventually gets printed to console, it is
printed in the correct order with other messages and there's no need for
any special prefix. And if the kernel crashes before the message makes it
to console, then prefix in the printk buffer doesn't make the situation
any better.

Link: http://lkml.org/lkml/2014/9/14/4

Signed-off-by: Markus Trippelsdorf
Acked-by: Jan Kara
Acked-by: Steven Rostedt
Cc: Geert Uytterhoeven
Cc: Peter Zijlstra
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Markus Trippelsdorf
2014-10-14 08:18:13 +0800
2240a31db printk: don't bother using LOG_CPU_MAX_BUF_SHIFT on !SMP ... Browse Code »

When configuring a uniprocessor kernel, don't bother the user with an
irrelevant LOG_CPU_MAX_BUF_SHIFT question, and don't build the unused
code.

Signed-off-by: Geert Uytterhoeven
Acked-by: Luis R. Rodriguez
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Geert Uytterhoeven
2014-10-14 08:18:12 +0800

09 Oct, 2014

1 commit

849f3127b switch /dev/kmsg to ->write_iter() ... Browse Code »

Signed-off-by: Al Viro

Al Viro
2014-10-09 14:39:09 +0800

11 Sep, 2014

1 commit

000a7d66e kernel/printk/printk.c: fix faulty logic in the case of recursive printk ... Browse Code »

We shouldn't set text_len in the code path that detects printk recursion
because text_len corresponds to the length of the string inside textbuf.
A few lines down from the line

text_len = strlen(recursion_msg);

is the line

text_len += vscnprintf(text + text_len, ...);

So if printk detects recursion, it sets text_len to 29 (the length of
recursion_msg) and logs an error. Then the message supplied by the
caller of printk is stored inside textbuf but offset by 29 bytes. This
means that the output of the recursive call to printk will contain 29
bytes of garbage in front of it.

This defect is caused by commit 458df9fd4815 ("printk: remove separate
printk_sched buffers and use printk buf instead") which turned the line

text_len = vscnprintf(text, ...);

into

text_len += vscnprintf(text + text_len, ...);

To fix this, this patch avoids setting text_len when logging the printk
recursion error. This patch also marks unlikely() the branch leading up
to this code.

Fixes: 458df9fd4815b478 ("printk: remove separate printk_sched buffers and use printk buf instead")
Signed-off-by: Patrick Palka
Reviewed-by: Petr Mladek
Reviewed-by: Jan Kara
Acked-by: Steven Rostedt
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Patrick Palka
2014-09-11 06:42:12 +0800

27 Aug, 2014

1 commit

bb964a92c kernel misc: Replace __get_cpu_var uses ... Browse Code »

Replace uses of __get_cpu_var for address calculation with this_cpu_ptr.

Cc: akpm@linux-foundation.org
Signed-off-by: Christoph Lameter
Signed-off-by: Tejun Heo

Christoph Lameter
2014-08-27 01:45:44 +0800

13 Aug, 2014

1 commit

14c4000a8 printk: Add function to return log buffer address and size ... Browse Code »

Platforms like IBM Power Systems supports service processor
assisted dump. It provides interface to add memory region to
be captured when system is crashed.

During initialization/running we can add kernel memory region
to be collected.

Presently we don't have a way to get the log buffer base address
and size. This patch adds support to return log buffer address
and size.

Signed-off-by: Vasant Hegde
Signed-off-by: Benjamin Herrenschmidt
Acked-by: Andrew Morton

Vasant Hegde
2014-08-13 13:13:44 +0800

07 Aug, 2014

11 commits

d25d9fece kernel/printk/printk.c: fix bool assignements ... Browse Code »

Fix coccinelle warnings.

Signed-off-by: Neil Zhang
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Neil Zhang
2014-08-07 09:01:24 +0800
5874af200 printk: enable interrupts before calling console_trylock_for_printk() ... Browse Code »

We need interrupts disabled when calling console_trylock_for_printk()
only so that cpu id we pass to can_use_console() remains valid (for
other things console_sem provides all the exclusion we need and
deadlocks on console_sem due to interrupts are impossible because we use
down_trylock()). However if we are rescheduled, we are guaranteed to
run on an online cpu so we can easily just get the cpu id in
can_use_console().

We can lose a bit of performance when we enable interrupts in
vprintk_emit() and then disable them again in console_unlock() but OTOH
it can somewhat reduce interrupt latency caused by console_unlock().

We differ from (reverted) commit 939f04bec1a4 in that we avoid calling
console_unlock() from vprintk_emit() with lockdep enabled as that has
unveiled quite some bugs leading to system freezes during boot (e.g.
https://lkml.org/lkml/2014/5/30/242,
https://lkml.org/lkml/2014/6/28/521).

Signed-off-by: Jan Kara
Tested-by: Andreas Bombe
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jan Kara
2014-08-07 09:01:24 +0800
249771b83 printk: miscellaneous cleanups ... Browse Code »

Some small cleanups to kernel/printk/printk.c. None of them should
cause any change in behavior.

- When CONFIG_PRINTK is defined, parenthesize the value of LOG_LINE_MAX.
- When CONFIG_PRINTK is *not* defined, there is an extra LOG_LINE_MAX
definition; delete it.
- Pull an assignment out of a conditional expression in console_setup().
- Use isdigit() in console_setup() rather than open coding it.
- In update_console_cmdline(), drop a NUL-termination assignment;
the strlcpy() call that precedes it guarantees it's not needed.
- Simplify some logic in printk_timed_ratelimit().

Signed-off-by: Alex Elder
Reviewed-by: Petr Mladek
Cc: Andi Kleen
Cc: Borislav Petkov
Cc: Jan Kara
Cc: John Stultz
Cc: Steven Rostedt
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Alex Elder
2014-08-07 09:01:24 +0800
e99aa4616 printk: use a clever macro ... Browse Code »

Use the IS_ENABLED() macro rather than #ifdef blocks to set certain
global values.

Signed-off-by: Alex Elder
Acked-by: Borislav Petkov
Reviewed-by: Petr Mladek
Cc: Andi Kleen
Cc: Jan Kara
Cc: John Stultz
Cc: Steven Rostedt
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Alex Elder
2014-08-07 09:01:24 +0800
0b90fec3b printk: fix some comments ... Browse Code »

Fix a few comments that don't accurately describe their corresponding
code. It also fixes some minor typographical errors.

Signed-off-by: Alex Elder
Reviewed-by: Petr Mladek
Cc: Andi Kleen
Cc: Borislav Petkov
Cc: Jan Kara
Cc: John Stultz
Cc: Steven Rostedt
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Alex Elder
2014-08-07 09:01:24 +0800
42a9dc0b3 printk: rename DEFAULT_MESSAGE_LOGLEVEL ... Browse Code »

Commit a8fe19ebfbfd ("kernel/printk: use symbolic defines for console
loglevels") makes consistent use of symbolic values for printk() log
levels.

The naming scheme used is different from the one used for
DEFAULT_MESSAGE_LOGLEVEL though. Change that symbol name to be
MESSAGE_LOGLEVEL_DEFAULT for consistency. And because the value of that
symbol comes from a similarly-named config option, rename
CONFIG_DEFAULT_MESSAGE_LOGLEVEL as well.

Signed-off-by: Alex Elder
Cc: Andi Kleen
Cc: Borislav Petkov
Cc: Jan Kara
Cc: John Stultz
Cc: Petr Mladek
Cc: Steven Rostedt
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Alex Elder
2014-08-07 09:01:24 +0800
e97e1267e printk: tweak do_syslog() to match comments ... Browse Code »

In do_syslog() there's a path used by kmsg_poll() and kmsg_read() that
only needs to know whether there's any data available to read (and not
its size). These callers only check for non-zero return. As a
shortcut, do_syslog() returns the difference between what has been
logged and what has been "seen."

The comments say that the "count of records" should be returned but it's
not. Instead it returns (log_next_idx - syslog_idx), which is a
difference between buffer offsets--and the result could be negative.

The behavior is the same (it'll be zero or not in the same cases), but
the count of records is more meaningful and it matches what the comments
say. So change the code to return that.

Signed-off-by: Alex Elder
Cc: Petr Mladek
Cc: Jan Kara
Cc: Joe Perches
Cc: John Stultz
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Alex Elder
2014-08-07 09:01:24 +0800
23b2899f7 printk: allow increasing the ring buffer depending on the number of CPUs ... Browse Code »

The default size of the ring buffer is too small for machines with a
large amount of CPUs under heavy load. What ends up happening when
debugging is the ring buffer overlaps and chews up old messages making
debugging impossible unless the size is passed as a kernel parameter.
An idle system upon boot up will on average spew out only about one or
two extra lines but where this really matters is on heavy load and that
will vary widely depending on the system and environment.

There are mechanisms to help increase the kernel ring buffer for tracing
through debugfs, and those interfaces even allow growing the kernel ring
buffer per CPU. We also have a static value which can be passed upon
boot. Relying on debugfs however is not ideal for production, and
relying on the value passed upon bootup is can only used *after* an
issue has creeped up. Instead of being reactive this adds a proactive
measure which lets you scale the amount of contributions you'd expect to
the kernel ring buffer under load by each CPU in the worst case
scenario.

We use num_possible_cpus() to avoid complexities which could be
introduced by dynamically changing the ring buffer size at run time,
num_possible_cpus() lets us use the upper limit on possible number of
CPUs therefore avoiding having to deal with hotplugging CPUs on and off.
This introduces the kernel configuration option LOG_CPU_MAX_BUF_SHIFT
which is used to specify the maximum amount of contributions to the
kernel ring buffer in the worst case before the kernel ring buffer flips
over, the size is specified as a power of 2. The total amount of
contributions made by each CPU must be greater than half of the default
kernel ring buffer size (1 << LOG_BUF_SHIFT bytes) in order to trigger
an increase upon bootup. The kernel ring buffer is increased to the
next power of two that would fit the required minimum kernel ring buffer
size plus the additional CPU contribution. For example if LOG_BUF_SHIFT
is 18 (256 KB) you'd require at least 128 KB contributions by other CPUs
in order to trigger an increase of the kernel ring buffer. With a
LOG_CPU_BUF_SHIFT of 12 (4 KB) you'd require at least anything over > 64
possible CPUs to trigger an increase. If you had 128 possible CPUs the
amount of minimum required kernel ring buffer bumps to:

((1 << 18) + ((128 - 1) * (1 << 12))) / 1024 = 764 KB

Since we require the ring buffer to be a power of two the new required
size would be 1024 KB.

This CPU contributions are ignored when the "log_buf_len" kernel
parameter is used as it forces the exact size of the ring buffer to an
expected power of two value.

[pmladek@suse.cz: fix build]
Signed-off-by: Luis R. Rodriguez
Signed-off-by: Petr Mladek
Tested-by: Davidlohr Bueso
Tested-by: Petr Mladek
Reviewed-by: Davidlohr Bueso
Cc: Andrew Lunn
Cc: Stephen Warren
Cc: Michal Hocko
Cc: Petr Mladek
Cc: Joe Perches
Cc: Arun KS
Cc: Kees Cook
Cc: Davidlohr Bueso
Cc: Chris Metcalf
Cc: Jan Kara
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Luis R. Rodriguez
2014-08-07 09:01:23 +0800
f54051722 printk: make dynamic units clear for the kernel ring buffer ... Browse Code »

Signed-off-by: Luis R. Rodriguez
Suggested-by: Davidlohr Bueso
Cc: Andrew Lunn
Cc: Stephen Warren
Cc: Greg Kroah-Hartman
Cc: Michal Hocko
Cc: Petr Mladek
Cc: Joe Perches
Cc: Arun KS
Cc: Kees Cook
Cc: Davidlohr Bueso
Cc: Chris Metcalf
Cc: Jan Kara
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Luis R. Rodriguez
2014-08-07 09:01:23 +0800
c0a318a36 printk: move power of 2 practice of ring buffer size to a helper ... Browse Code »

In practice the power of 2 practice of the size of the kernel ring
buffer remains purely historical but not a requirement, specially now
that we have LOG_ALIGN and use it for both static and dynamic
allocations. It could have helped with implicit alignment back in the
days given the even the dynamically sized ring buffer was guaranteed to
be aligned so long as CONFIG_LOG_BUF_SHIFT was set to produce a
__LOG_BUF_LEN which is architecture aligned, since log_buf_len=n would
be allowed only if it was > __LOG_BUF_LEN and we always ended up
rounding the log_buf_len=n to the next power of 2 with
roundup_pow_of_two(), any multiple of 2 then should be also architecture
aligned. These assumptions of course relied heavily on
CONFIG_LOG_BUF_SHIFT producing an aligned value but users can always
change this.

We now have precise alignment requirements set for the log buffer size
for both static and dynamic allocations, but lets upkeep the old
practice of using powers of 2 for its size to help with easy expected
scalable values and the allocators for dynamic allocations. We'll reuse
this later so move this into a helper.

Signed-off-by: Luis R. Rodriguez
Cc: Andrew Lunn
Cc: Stephen Warren
Cc: Greg Kroah-Hartman
Cc: Michal Hocko
Cc: Petr Mladek
Cc: Joe Perches
Cc: Arun KS
Cc: Kees Cook
Cc: Davidlohr Bueso
Cc: Chris Metcalf
Cc: Jan Kara
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Luis R. Rodriguez
2014-08-07 09:01:23 +0800
703001775 printk: make dynamic kernel ring buffer alignment explicit ... Browse Code »

We have to consider alignment for the ring buffer both for the default
static size, and then also for when an dynamic allocation is made when
the log_buf_len=n kernel parameter is passed to set the size
specifically to a size larger than the default size set by the
architecture through CONFIG_LOG_BUF_SHIFT.

The default static kernel ring buffer can be aligned properly if
architectures set CONFIG_LOG_BUF_SHIFT properly, we provide ranges for
the size though so even if CONFIG_LOG_BUF_SHIFT has a sensible aligned
value it can be reduced to a non aligned value. Commit 6ebb017de9
("printk: Fix alignment of buf causing crash on ARM EABI") by Andrew
Lunn ensures the static buffer is always aligned and the decision of
alignment is done by the compiler by using __alignof__(struct log).

When log_buf_len=n is used we allocate the ring buffer dynamically.
Dynamic allocation varies, for the early allocation called before
setup_arch() memblock_virt_alloc() requests a page aligment and for the
default kernel allocation memblock_virt_alloc_nopanic() requests no
special alignment, which in turn ends up aligning the allocation to
SMP_CACHE_BYTES, which is L1 cache aligned.

Since we already have the required alignment for the kernel ring buffer
though we can do better and request explicit alignment for LOG_ALIGN.
This does that to be safe and make dynamic allocation alignment
explicit.

Signed-off-by: Luis R. Rodriguez
Tested-by: Petr Mladek
Acked-by: Petr Mladek
Cc: Andrew Lunn
Cc: Stephen Warren
Cc: Greg Kroah-Hartman
Cc: Michal Hocko
Cc: Petr Mladek
Cc: Joe Perches
Cc: Arun KS
Cc: Kees Cook
Cc: Davidlohr Bueso
Cc: Chris Metcalf
Cc: Jan Kara
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Luis R. Rodriguez
2014-08-07 09:01:23 +0800

04 Jul, 2014

1 commit

d18bbc215 kernel/printk/printk.c: revert "printk: enable interrupts before calling console… ... Browse Code »

…_trylock_for_printk()"

Revert commit 939f04bec1a4 ("printk: enable interrupts before calling
console_trylock_for_printk()").

Andreas reported:

: None of the post 3.15 kernel boot for me. They all hang at the GRUB
: screen telling me it loaded and started the kernel, but the kernel
: itself stops before it prints anything (or even replaces the GRUB
: background graphics).

939f04bec1a4 is modest latency reduction. Revert it until we understand
the reason for these failures.

Reported-by: Andreas Bombe <aeb@debian.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Andrew Morton
2014-07-04 00:21:54 +0800

05 Jun, 2014

11 commits

a8fe19ebf kernel/printk: use symbolic defines for console loglevels ... Browse Code »
13

... instead of naked numbers.

Stuff in sysrq.c used to set it to 8 which is supposed to mean above
default level so set it to DEBUG instead as we're terminating/killing all
tasks and we want to be verbose there.

Also, correct the check in x86_64_start_kernel which should be >= as
we're clearly issuing the string there for all debug levels, not only
the magical 10.

Signed-off-by: Borislav Petkov
Acked-by: Kees Cook
Acked-by: Randy Dunlap
Cc: Joe Perches
Cc: Valdis Kletnieks
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Borislav Petkov
2014-06-05 07:54:17 +0800
84b5ec8a9 printk: report dropping of messages from logbuf ... Browse Code »

If the log ring buffer becomes full, we silently overwrite old messages
with new data. console_unlock will detect this case and fast-forward the
console_* pointers to skip over the corrupted data, but nothing will be
reported to the user.

This patch hijacks the first valid log message after detecting that we
dropped messages and prefixes it with a note detailing how many messages
were dropped. For long (~1000 char) messages, this will result in some
truncation of the real message, but given that we're dropping things
anyway, that doesn't seem to be the end of the world.

Signed-off-by: Will Deacon
Acked-by: Peter Zijlstra
Cc: Kay Sievers
Cc: Jan Kara
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Will Deacon
2014-06-05 07:54:17 +0800
aac74dc49 printk: rename printk_sched to printk_deferred ... Browse Code »
6

After learning we'll need some sort of deferred printk functionality in
the timekeeping core, Peter suggested we rename the printk_sched function
so it can be reused by needed subsystems.

This only changes the function name. No logic changes.

Signed-off-by: John Stultz
Reviewed-by: Steven Rostedt
Cc: Jan Kara
Cc: Peter Zijlstra
Cc: Jiri Bohac
Cc: Thomas Gleixner
Cc: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

John Stultz
2014-06-05 07:54:17 +0800
819546062 printk: disable preemption for printk_sched ... Browse Code »

An earlier change in -mm (printk: remove separate printk_sched
buffers...), removed the printk_sched irqsave/restore lines since it was
safe for current users. Since we may be expanding usage of
printk_sched(), disable preepmtion for this function to make it more
generally safe to call.

Signed-off-by: John Stultz
Reviewed-by: Jan Kara
Cc: Peter Zijlstra
Cc: Jiri Bohac
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: Steven Rostedt
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

John Stultz
2014-06-05 07:54:17 +0800
458df9fd4 printk: remove separate printk_sched buffers and use printk buf instead ... Browse Code »
52

To prevent deadlocks with doing a printk inside the scheduler,
printk_sched() was created. The issue is that printk has a console_sem
that it can grab and release. The release does a wake up if there's a
task pending on the sem, and this wake up grabs the rq locks that is
held in the scheduler. This leads to a possible deadlock if the wake up
uses the same rq as the one with the rq lock held already.

What printk_sched() does is to save the printk write in a per cpu buffer
and sets the PRINTK_PENDING_SCHED flag. On a timer tick, if this flag is
set, the printk() is done against the buffer.

There's a couple of issues with this approach.

1) If two printk_sched()s are called before the tick, the second one
will overwrite the first one.

2) The temporary buffer is 512 bytes and is per cpu. This is a quite a
bit of space wasted for something that is seldom used.

In order to remove this, the printk_sched() can use the printk buffer
instead, and delay the console_trylock()/console_unlock() to the queued
work.

Because printk_sched() would then be taking the logbuf_lock, the
logbuf_lock must not be held while doing anything that may call into the
scheduler functions, which includes wake ups. Unfortunately, printk()
also has a console_sem that it uses, and on release, the up(&console_sem)
may do a wake up of any pending waiters. This must be avoided while
holding the logbuf_lock.

Signed-off-by: Steven Rostedt
Signed-off-by: Jan Kara
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Steven Rostedt
2014-06-05 07:54:17 +0800
939f04bec printk: enable interrupts before calling console_trylock_for_printk() ... Browse Code »
77

We need interrupts disabled when calling console_trylock_for_printk()
only so that cpu id we pass to can_use_console() remains valid (for
other things console_sem provides all the exclusion we need and
deadlocks on console_sem due to interrupts are impossible because we use
down_trylock()). However if we are rescheduled, we are guaranteed to
run on an online cpu so we can easily just get the cpu id in
can_use_console().

We can lose a bit of performance when we enable interrupts in
vprintk_emit() and then disable them again in console_unlock() but OTOH
it can somewhat reduce interrupt latency caused by console_unlock()
especially since later in the patch series we will want to spin on
console_sem in console_trylock_for_printk().

Signed-off-by: Jan Kara
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jan Kara
2014-06-05 07:54:17 +0800
bd8d7cf5b printk: fix lockdep instrumentation of console_sem ... Browse Code »

Printk calls mutex_acquire() / mutex_release() by hand to instrument
lockdep about console_sem. However in some corner cases the
instrumentation is missing. Fix the problem by creating helper functions
for locking / unlocking console_sem which take care of lockdep
instrumentation as well.

Signed-off-by: Jan Kara
Reported-by: Fabio Estevam
Reported-by: Andy Shevchenko
Tested-by: Fabio Estevam
Tested-By: Valdis Kletnieks
Cc: Steven Rostedt
Cc: Peter Zijlstra
Cc: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jan Kara
2014-06-05 07:54:16 +0800
608873cac printk: release lockbuf_lock before calling console_trylock_for_printk() ... Browse Code »

There's no reason to hold lockbuf_lock when entering
console_trylock_for_printk().

The first thing this function does is to call down_trylock(console_sem)
and if that fails it immediately unlocks lockbuf_lock. So lockbuf_lock
isn't needed for that branch. When down_trylock() succeeds, the rest of
console_trylock() is OK without lockbuf_lock (it is called without it
from other places), and the only remaining thing in
console_trylock_for_printk() is can_use_console() call. For that call
console_sem is enough (it iterates all consoles and checks CON_ANYTIME
flag).

So we drop logbuf_lock before entering console_trylock_for_printk() which
simplifies the code.

[akpm@linux-foundation.org: fix have_callable_console() comment]
Signed-off-by: Jan Kara
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jan Kara
2014-06-05 07:54:16 +0800
ca1d432ad printk: remove outdated comment ... Browse Code »

Comment about interesting interlocking between lockbuf_lock and
console_sem is outdated.

It was added in 2002 by commit a880f45a48be during conversion of
console_lock to console_sem + lockbuf_lock.

At that time release_console_sem() (today's equivalent is
console_unlock()) was indeed using lockbuf_lock to avoid races between
trylock on console_sem in printk() and unlock of console_sem. However
these days the interlocking is gone and the races are avoided by
rechecking logbuf state after releasing console_sem.

Signed-off-by: Jan Kara
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jan Kara
2014-06-05 07:54:16 +0800
034633ccb printk: return really stored message length ... Browse Code »

I wonder if anyone uses printk return value but it is there and should be
counted correctly.

This patch modifies log_store() to return the number of really stored
bytes from the 'text' part. Also it handles the return value in
vprintk_emit().

Note that log_store() is used also in cont_flush() but we could ignore the
return value there. The function works with characters that were already
counted earlier. In addition, the store could newer fail here because the
length of the printed text is limited by the "cont" buffer and "dict" is
NULL.

Signed-off-by: Petr Mladek
Cc: Jan Kara
Cc: Jiri Kosina
Cc: Kay Sievers
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Petr Mladek
2014-06-05 07:54:16 +0800
55bd53a4e printk: shrink too long messages ... Browse Code »

We might want to print at least part of too long messages and add some
warning for debugging purpose.

The question is how long the shrunken message should be. If we use the
whole buffer, it might get rotated too soon. Let's try to use only 1/4 of
the buffer for now.

Also shrink the whole dictionary. We do not want to parse it or break it
in the middle of some pair of values. It would not cause any real harm
but still.

Signed-off-by: Petr Mladek
Cc: Jan Kara
Cc: Jiri Kosina
Cc: Kay Sievers
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Petr Mladek
2014-06-05 07:54:16 +0800