16 Dec, 2015

1 commit


20 Nov, 2015

2 commits


04 Aug, 2015

1 commit

  • commit d194e5d666225b04c7754471df0948f645b6ab3a upstream.

    The final version of commit 637241a900cb ("kmsg: honor dmesg_restrict
    sysctl on /dev/kmsg") lost a few hooks; as a result, security_syslog()
    calls are processed incorrectly:

    - open of /dev/kmsg checks syslog access permissions by using
    check_syslog_permissions() where security_syslog() is not called if
    dmesg_restrict is set.

    - syslog syscall and /proc/kmsg calls do_syslog() where security_syslog
    can be executed twice (inside check_syslog_permissions() and then
    directly in do_syslog())

    With this patch security_syslog() is called once only in all
    syslog-related operations regardless of dmesg_restrict value.

    Fixes: 637241a900cb ("kmsg: honor dmesg_restrict sysctl on /dev/kmsg")
    Signed-off-by: Vasily Averin
    Cc: Kees Cook
    Cc: Josh Boyer
    Cc: Eric Paris
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Vasily Averin
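
A minimal userspace sketch of the call structure described above, assuming a much simplified permission model. The names `security_syslog_stub`, `check_syslog_permissions_sketch`, `do_syslog_sketch` and the three globals are hypothetical stand-ins, not the kernel's code; the point is only that the LSM hook is reached from one place, whatever the value of dmesg_restrict.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for kernel state; none of this is kernel code. */
static bool dmesg_restrict;   /* models the kernel.dmesg_restrict sysctl */
static bool has_cap_syslog;   /* models capable(CAP_SYSLOG) */
static int lsm_calls;         /* counts invocations of the LSM hook stand-in */

/* Stand-in for security_syslog(): always allows, but records the call. */
static int security_syslog_stub(int type)
{
    (void)type;
    lsm_calls++;
    return 0;
}

/*
 * Single permission check. Every allowed path leaves through the LSM
 * hook exactly once, regardless of dmesg_restrict. Assumption of this
 * sketch: a capability denial returns before the hook is consulted.
 */
static int check_syslog_permissions_sketch(int type)
{
    assert(type >= 0);
    if (dmesg_restrict && !has_cap_syslog)
        return -1;
    return security_syslog_stub(type);
}

/* The syslog path relies on the check above and does not call the hook again. */
static int do_syslog_sketch(int type)
{
    int err = check_syslog_permissions_sketch(type);

    if (err)
        return err;
    /* ... perform the requested syslog action here ... */
    return 0;
}
```

With this shape, a caller of `do_syslog_sketch()` triggers the hook once whether or not `dmesg_restrict` is set, which is the property the patch describes.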
     

22 Apr, 2015

1 commit

  • Pull tty/serial updates from Greg KH:
    "Here's the big tty/serial driver update for 4.1-rc1.

    It was delayed for a bit due to some questions surrounding some of the
    console command line parsing changes that are in here. There's still
    one tiny regression for people who were previously putting multiple
    console command lines and expecting them all to be ignored for some
    odd reason, but Peter is working on fixing that. If not, I'll send a
    revert for the offending patch, but I have faith that Peter can
    address it.

    Other than the console work here, there's the usual serial driver
    updates and changes, and a bunch of 8250 reworks to try to make that
    driver easier to maintain over time, and have it support more devices
    in the future.

    All of these have been in linux-next for a while"

    * tag 'tty-4.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (119 commits)
    n_gsm: Drop unneeded cast on netdev_priv
    sc16is7xx: expose RTS inversion in RS-485 mode
    serial: 8250_pci: port failed after wakeup from S3
    earlycon: 8250: Document kernel command line options
    earlycon: 8250: Fix command line regression
    earlycon: Fix __earlycon_table stride
    tty: clean up the tty time logic a bit
    serial: 8250_dw: only get the clock rate in one place
    serial: 8250_dw: remove useless ACPI ID check
    dmaengine: hsu: move memory allocation to GFP_NOWAIT
    dmaengine: hsu: remove redundant pieces of code
    serial: 8250_pci: add Intel Tangier support
    dmaengine: hsu: add Intel Tangier PCI ID
    serial: 8250_pci: replace switch-case by formula for Intel MID
    serial: 8250_pci: replace switch-case by formula
    tty: cpm_uart: replace CONFIG_8xx by CONFIG_CPM1
    serial: jsm: some off by one bugs
    serial: xuartps: Fix check in console_setup().
    serial: xuartps: Get rid of register access macros.
    serial: xuartps: Fix iobase use.
    ...

    Linus Torvalds
     

12 Apr, 2015

1 commit


26 Mar, 2015

2 commits

  • Add match() method to struct console which allows the console to
    perform console command line matching instead of (or in addition to)
    default console matching (ie., by fixed name and index).

    The match() method returns 0 to indicate a successful match; normal
    console matching occurs if no match() method is defined or the
    match() method returns non-zero. The match() method is expected to set
    the console index if required.

    Re-implement earlycon-to-console-handoff with direct matching of
    "console=uart|uart8250,..." to the 8250 ttyS console.

    Acked-by: Rob Herring
    Signed-off-by: Peter Hurley
    Signed-off-by: Greg Kroah-Hartman

    Peter Hurley
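
The matching order described in this commit can be sketched as follows. This is a simplified userspace model: `struct console_sketch`, `console_matches()` and `uart_alias_match()` are hypothetical names, and the match() signature here omits arguments the real method may take. It shows only the rule that a match() return of 0 wins, and that default name-and-index matching is used otherwise.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced console descriptor for illustration only. */
struct console_sketch {
    char name[16];
    int index;   /* -1 means "not yet assigned" */
    /* Optional hook: returns 0 on a successful match, non-zero otherwise. */
    int (*match)(struct console_sketch *con, const char *name, int idx);
};

/* Returns 1 if the console matches the command-line entry, 0 otherwise. */
static int console_matches(struct console_sketch *con,
                           const char *cmd_name, int cmd_idx)
{
    assert(con != NULL && cmd_name != NULL);

    /* A defined match() that returns 0 decides the match on its own. */
    if (con->match && con->match(con, cmd_name, cmd_idx) == 0)
        return 1;

    /* Otherwise fall back to default matching by fixed name and index. */
    if (strcmp(con->name, cmd_name) != 0)
        return 0;
    if (con->index >= 0 && con->index != cmd_idx)
        return 0;
    if (con->index < 0)
        con->index = cmd_idx;
    return 1;
}

/* Hypothetical match(): accepts the "uart"/"uart8250" aliases and sets the index itself. */
static int uart_alias_match(struct console_sketch *con, const char *name, int idx)
{
    (void)idx;   /* the command-line index is not meaningful for the alias */
    if (strcmp(name, "uart") != 0 && strcmp(name, "uart8250") != 0)
        return -1;
    con->index = 0;
    return 0;
}
```

Note that the hypothetical `uart_alias_match()` assigns `con->index` itself, mirroring the statement that match() is expected to set the console index if required.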
     
  • struct kiocb now is a generic I/O container, so move it to fs.h.
    Also do a #include diet for aio.h while we're at it.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Christoph Hellwig
     

13 Mar, 2015

1 commit


09 Mar, 2015

1 commit


07 Mar, 2015

2 commits

  • Before register_console() calls the setup() method of the matched
    console, the registering console index is already equal to the index
    from the console command line; ie. newcon->index == c->index.

    This change is also required to support extensible console matching;
    (the command line index may have no relation to the console index
    assigned by the console-defined match() function).

    Signed-off-by: Peter Hurley
    Signed-off-by: Greg Kroah-Hartman

    Peter Hurley
     
  • commit 6ae9200f2cab7 ("enlarge console.name") increased the storage
    for the console name to 16 bytes, but not the corresponding
    struct console_cmdline::name storage. Console names longer than
    8 bytes cause read beyond end-of-string and failure to match
    console; I'm not sure if there are other unexpected consequences.

    Cc: # 2.6.22+
    Signed-off-by: Peter Hurley
    Signed-off-by: Greg Kroah-Hartman

    Peter Hurley
     

21 Feb, 2015

1 commit

  • Pull kgdb/kdb updates from Jason Wessel:
    "KGDB/KDB New:
    - KDB: improved searching
    - No longer enter debug core on panic if panic timeout is set

    KGDB/KDB regressions / cleanups
    - fix pdf doc build errors
    - prevent junk characters on kdb console from printk levels"

    * tag 'for_linux-3.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/kgdb:
    kgdb, docs: Fix pdfdocs build errors
    debug: prevent entering debug mode on panic/exception.
    kdb: Const qualifier for kdb_getstr's prompt argument
    kdb: Provide forward search at more prompt
    kdb: Fix a prompt management bug when using | grep
    kdb: Remove stack dump when entering kgdb due to NMI
    kdb: Avoid printing KERN_ levels to consoles
    kdb: Fix off by one error in kdb_cpu()
    kdb: fix incorrect counts in KDB summary command output

    Linus Torvalds
     

20 Feb, 2015

1 commit

  • Currently when kdb traps printk messages then the raw log level prefix
    (consisting of '\001' followed by a numeral) does not get stripped off
    before the message is issued to the various I/O handlers supported by
    kdb. This causes annoying visual noise as well as causing problems
    grepping for ^. It is also a change of behaviour compared to normal usage
    of printk(). For example, SysRq-h ends up with different output to
    that of kdb's "sr h".

    This patch addresses the problem by stripping log levels from messages
    before they are issued to the I/O handlers. printk() which can also
    act as an i/o handler in some cases is special cased; if the caller
    provided a log level then the prefix will be preserved when sent to
    printk().

    The addition of non-printable characters to the output of kdb commands is a
    regression, albeit an extremely elderly one, introduced by commit
    04d2c8c83d0e ("printk: convert the format for KERN_ to a 2 byte
    pattern"). Note also that this patch does *not* restore the original
    behaviour from v3.5. Instead it makes printk() from within a kdb command
    display the message without any prefix (i.e. like printk() normally does).

    Signed-off-by: Daniel Thompson
    Cc: Joe Perches
    Cc: stable@vger.kernel.org
    Signed-off-by: Jason Wessel

    Daniel Thompson
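
A small sketch of the stripping step, assuming the 2-byte prefix format described above ('\001' followed by a numeral). The helper name `skip_log_level()` is hypothetical and handles only the digits '0' to '7'; it is not the function the patch uses.

```c
#include <assert.h>
#include <stddef.h>

#define KERN_SOH_ASCII '\001'   /* first byte of the 2-byte log level prefix */

/*
 * Returns a pointer past the log level prefix if one is present,
 * otherwise returns the message unchanged.
 */
static const char *skip_log_level(const char *msg)
{
    assert(msg != NULL);

    if (msg[0] == KERN_SOH_ASCII && msg[1] >= '0' && msg[1] <= '7')
        return msg + 2;
    return msg;
}
```

Passing every message through such a helper before handing it to an I/O handler removes the non-printable byte, so patterns anchored with ^ match the visible text again.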
     

13 Feb, 2015

1 commit


14 Dec, 2014

1 commit


11 Dec, 2014

6 commits

  • As printk_func will either be the default function, or a per_cpu function
    for the current CPU, there's no reason to disable preemption to access
    it from printk. That's because if the printk_func is not the default
    then the caller had better have disabled preemption as they were the one to
    change it.

    Link: http://lkml.kernel.org/r/CA+55aFz5-_LKW4JHEBoWinN9_ouNcGRWAF2FUA35u46FRN-Kxw@mail.gmail.com

    Suggested-by: Linus Torvalds
    Signed-off-by: Steven Rostedt

    Steven Rostedt (Red Hat)
     
  • Pull nmi-safe seq_buf printk update from Steven Rostedt:
    "This code is a fork from the trace-3.19 pull as it needed the
    trace_seq clean ups from that branch.

    This code solves the issue of performing stack dumps from NMI context.
    The issue is that printk() is not safe from NMI context as if the NMI
    were to trigger when a printk() was being performed, the NMI could
    deadlock from the printk() internal locks. This has been seen in
    practice.

    With lots of review from Petr Mladek, this code went through several
    iterations, and we feel that it is now at a point of quality to be
    accepted into mainline.

    Here's what is contained in this patch set:

    - Creates a "seq_buf" generic buffer utility that allows a descriptor
    to be passed around where functions can write their own "printk()"
    formatted strings into it. The generic version was pulled out of
    the trace_seq() code that was made specifically for tracing.

    - The seq_buf code was changed to model the seq_file code. I have a
    patch (not included for 3.19) that converts the seq_file.c code
    over to use seq_buf.c like the trace_seq.c code does. This was
    done to make sure that seq_buf.c is compatible with seq_file.c. I
    may try to get that patch in for 3.20.

    - The seq_buf.c file was moved to lib/ to remove it from being
    dependent on CONFIG_TRACING.

    - The printk() was updated to allow for a per_cpu "override" of the
    internal calls. That is, instead of writing to the console, a call
    to printk() may do something else. This made it easier to allow
    the NMI to change what printk() does in order to call dump_stack()
    without needing to update that code as well.

    - Finally, the dump_stack from all CPUs via NMI code was converted to
    use the seq_buf code. The caller to trigger the NMI code would
    wait till all the NMIs finished, and then it would print the
    seq_buf data to the console safely from a non NMI context

    One added bonus is that this code also makes the NMI dump stack work
    on PREEMPT_RT kernels. As printk() includes sleeping locks on
    PREEMPT_RT, printk() only writes to console if the console does not
    use any rt_mutex converted spin locks. Which a lot do"

    * tag 'trace-seq-buf-3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
    x86/nmi: Fix use of unallocated cpumask_var_t
    printk/percpu: Define printk_func when printk is not defined
    x86/nmi: Perform a safe NMI stack trace on all CPUs
    printk: Add per_cpu printk func to allow printk to be diverted
    seq_buf: Move the seq_buf code to lib/
    seq-buf: Make seq_buf_bprintf() conditional on CONFIG_BINARY_PRINTF
    tracing: Add seq_buf_get_buf() and seq_buf_commit() helper functions
    tracing: Have seq_buf use full buffer
    seq_buf: Add seq_buf_can_fit() helper function
    tracing: Add paranoid size check in trace_printk_seq()
    tracing: Use trace_seq_used() and seq_buf_used() instead of len
    tracing: Clean up tracing_fill_pipe_page()
    seq_buf: Create seq_buf_used() to find out how much was written
    tracing: Add a seq_buf_clear() helper and clear len and readpos in init
    tracing: Convert seq_buf fields to be like seq_file fields
    tracing: Convert seq_buf_path() to be like seq_path()
    tracing: Create seq_buf layer in trace_seq

    Linus Torvalds
     
  • Merge first patchbomb from Andrew Morton:
    - a few minor cifs fixes
    - dma-debug updates
    - ocfs2
    - slab
    - about half of MM
    - procfs
    - kernel/exit.c
    - panic.c tweaks
    - printk updates
    - lib/ updates
    - checkpatch updates
    - fs/binfmt updates
    - the drivers/rtc tree
    - nilfs
    - kmod fixes
    - more kernel/exit.c
    - various other misc tweaks and fixes

    * emailed patches from Andrew Morton : (190 commits)
    exit: pidns: fix/update the comments in zap_pid_ns_processes()
    exit: pidns: alloc_pid() leaks pid_namespace if child_reaper is exiting
    exit: exit_notify: re-use "dead" list to autoreap current
    exit: reparent: call forget_original_parent() under tasklist_lock
    exit: reparent: avoid find_new_reaper() if no children
    exit: reparent: introduce find_alive_thread()
    exit: reparent: introduce find_child_reaper()
    exit: reparent: document the ->has_child_subreaper checks
    exit: reparent: s/while_each_thread/for_each_thread/ in find_new_reaper()
    exit: reparent: fix the cross-namespace PR_SET_CHILD_SUBREAPER reparenting
    exit: reparent: fix the dead-parent PR_SET_CHILD_SUBREAPER reparenting
    exit: proc: don't try to flush /proc/tgid/task/tgid
    exit: release_task: fix the comment about group leader accounting
    exit: wait: drop tasklist_lock before psig->c* accounting
    exit: wait: don't use zombie->real_parent
    exit: wait: cleanup the ptrace_reparented() checks
    usermodehelper: kill the kmod_thread_locker logic
    usermodehelper: don't use CLONE_VFORK for ____call_usermodehelper()
    fs/hfs/catalog.c: fix comparison bug in hfs_cat_keycmp
    nilfs2: fix the nilfs_iget() vs. nilfs_new_inode() races
    ...

    Linus Torvalds
     
  • Pranith Kumar posted a patch which removed the "volatile"
    qualifier for the "logbuf_cpu" variable in vprintk_emit().
    https://lkml.org/lkml/2014/11/13/894
    In his patch, he used ACCESS_ONCE() for all references to
    that symbol to provide whatever protection was intended.

    There was some discussion that followed, and in the end Steven Rostedt
    concluded that not only was "volatile" not needed, neither was it
    required to use ACCESS_ONCE(). I offered an elaborate description that
    concluded Steven was right, and Pranith asked me to submit an
    alternative patch. And this is it.

    The basic reason "volatile" is not needed is that "logbuf_cpu" has
    static storage duration, and vprintk_emit() is an exported
    interface. This means that the value of logbuf_cpu must be read
    from memory the first time it is used in a particular call of
    vprintk_emit(). The variable's value is read only once in that
    function; when it's read, it'll be the copy from memory (or cache).

    In addition, the value of "logbuf_cpu" is only ever written under
    protection of a spinlock. So the value that is read is the "real"
    value (and not an out-of-date cached one). If its value is not
    UINT_MAX, it is the current CPU's processor id, and it will have
    been last written by the running CPU.

    Signed-off-by: Alex Elder
    Reported-by: Pranith Kumar
    Suggested-by: Steven Rostedt
    Reviewed-by: Jan Kara
    Cc: Petr Mladek
    Cc: Luis R. Rodriguez
    Cc: Joe Perches
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alex Elder
     
  • Use #defines instead of magic values.

    Signed-off-by: Joe Perches
    Acked-by: Greg Kroah-Hartman
    Cc: Jason Baron
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joe Perches
     
  • Eliminate the unlikely possibility of message interleaving for
    early_printk/early_vprintk use.

    early_vprintk can be done via the %pV extension so remove this
    unnecessary function and change early_printk to have the equivalent
    vprintk code.

    All uses of early_printk already end with a newline so also remove the
    unnecessary newline from the early_printk function.

    Signed-off-by: Joe Perches
    Acked-by: Chris Metcalf
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joe Perches
     

22 Nov, 2014

1 commit

  • To avoid include hell, the per_cpu variable printk_func was declared
    in percpu.h. But it is only defined if printk is defined.

    As users of printk may also use the printk_func variable, it needs to
    be defined even if CONFIG_PRINTK is not.

    Also add a printk.h include in percpu.h just to be safe.

    Link: http://lkml.kernel.org/r/20141121183215.01ba539c@canb.auug.org.au

    Reported-by: Stephen Rothwell
    Signed-off-by: Steven Rostedt

    Steven Rostedt (Red Hat)
     

20 Nov, 2014

1 commit

  • Being able to divert printk to call another function besides the normal
    logging is useful for things like NMI handling. If functions that call
    printk() are invoked from NMI context, it is possible to lock up
    the box if the NMI handler triggers while another printk() is in progress.

    One example of this use is to perform a stack trace on all CPUs via NMI.
    But if the NMI is to do the printk() it can cause the system to lock up.
    By allowing the printk to be diverted to another function that can safely
    record the printk output and then print it when it is in a safe context,
    NMIs can safely call functions like show_regs().

    Link: http://lkml.kernel.org/p/20140619213952.209176403@goodmis.org

    Tested-by: Jiri Kosina
    Acked-by: Jiri Kosina
    Acked-by: Paul E. McKenney
    Reviewed-by: Petr Mladek
    Signed-off-by: Steven Rostedt

    Steven Rostedt (Red Hat)
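
The diversion idea can be sketched in userspace with a replaceable function pointer. Everything here is an assumption made for illustration: a single global pointer stands in for the per-CPU variable, fixed arrays stand in for the console and for the buffer used in NMI context, and all `*_sketch` names are hypothetical.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

typedef int (*printk_func_t)(const char *fmt, va_list args);

static char console_out[256];   /* stand-in for the console */
static char nmi_buf[256];       /* stand-in for a buffer that is safe to fill from NMI */
static size_t nmi_len;

/* Default target: format straight to the "console". */
static int vprintk_default_sketch(const char *fmt, va_list args)
{
    return vsnprintf(console_out, sizeof(console_out), fmt, args);
}

/* Diverted target: only append to a buffer, touching no console state. */
static int vprintk_nmi_sketch(const char *fmt, va_list args)
{
    size_t room = sizeof(nmi_buf) - nmi_len;
    int n = vsnprintf(nmi_buf + nmi_len, room, fmt, args);

    if (n < 0)
        return n;
    if ((size_t)n >= room)   /* output was truncated to fit */
        n = (int)(room - 1);
    nmi_len += (size_t)n;
    return n;
}

/* One pointer models the per-CPU printk_func of the current CPU. */
static printk_func_t printk_func = vprintk_default_sketch;

static int printk_sketch(const char *fmt, ...)
{
    va_list args;
    int r;

    assert(fmt != NULL);
    va_start(args, fmt);
    r = printk_func(fmt, args);
    va_end(args);
    return r;
}

/* Called later from a safe context, after the default has been restored. */
static void flush_nmi_buf(void)
{
    printk_sketch("%s", nmi_buf);
    memset(nmi_buf, 0, sizeof(nmi_buf));
    nmi_len = 0;
}
```

In this model the code running in NMI context would set `printk_func` to `vprintk_nmi_sketch`, call the usual printing helpers, and restore the default; the recorded text is emitted only when `flush_nmi_buf()` runs afterwards.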
     

06 Nov, 2014

1 commit

  • When the kernel.dmesg_restrict restriction is in place, only users with
    CAP_SYSLOG should be able to access crash dumps (like: attacker is
    trying to exploit a bug, watchdog reboots, attacker can happily read
    crash dumps and logs).

    This puts the restriction on console-* types as well as sensitive
    information could have been leaked there.

    Other log types are unaffected.

    Signed-off-by: Sebastian Schmidt
    Acked-by: Kees Cook
    Signed-off-by: Tony Luck

    Sebastian Schmidt
     

15 Oct, 2014

1 commit

  • Pull percpu consistent-ops changes from Tejun Heo:
    "Way back, before the current percpu allocator was implemented, static
    and dynamic percpu memory areas were allocated and handled separately
    and had their own accessors. The distinction has been gone for many
    years now; however, the now duplicate two sets of accessors remained
    with the pointer based ones - this_cpu_*() - evolving various other
    operations over time. During the process, we also accumulated other
    inconsistent operations.

    This pull request contains Christoph's patches to clean up the
    duplicate accessor situation. __get_cpu_var() uses are replaced
    with this_cpu_ptr() and __this_cpu_ptr() with raw_cpu_ptr().

    Unfortunately, the former sometimes is tricky thanks to C being a bit
    messy with the distinction between lvalues and pointers, which led to
    a rather ugly solution for cpumask_var_t involving the introduction of
    this_cpu_cpumask_var_ptr().

    This converts most of the uses but not all. Christoph will follow up
    with the remaining conversions in this merge window and hopefully
    remove the obsolete accessors"

    * 'for-3.18-consistent-ops' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (38 commits)
    irqchip: Properly fetch the per cpu offset
    percpu: Resolve ambiguities in __get_cpu_var/cpumask_var_t -fix
    ia64: sn_nodepda cannot be assigned to after this_cpu conversion. Use __this_cpu_write.
    percpu: Resolve ambiguities in __get_cpu_var/cpumask_var_t
    Revert "powerpc: Replace __get_cpu_var uses"
    percpu: Remove __this_cpu_ptr
    clocksource: Replace __this_cpu_ptr with raw_cpu_ptr
    sparc: Replace __get_cpu_var uses
    avr32: Replace __get_cpu_var with __this_cpu_write
    blackfin: Replace __get_cpu_var uses
    tile: Use this_cpu_ptr() for hardware counters
    tile: Replace __get_cpu_var uses
    powerpc: Replace __get_cpu_var uses
    alpha: Replace __get_cpu_var
    ia64: Replace __get_cpu_var uses
    s390: cio driver &__get_cpu_var replacements
    s390: Replace __get_cpu_var uses
    mips: Replace __get_cpu_var uses
    MIPS: Replace __get_cpu_var uses in FPU emulator.
    arm: Replace __this_cpu_ptr with raw_cpu_ptr
    ...

    Linus Torvalds
     

14 Oct, 2014

2 commits

  • Commit 458df9fd4815 ("printk: remove separate printk_sched buffers and use
    printk buf instead") hardcodes printk_deferred() to KERN_WARNING and
    inserts the string "[sched_delayed] " before the actual message. However
    it doesn't take into account the KERN_* prefix of the message, that now
    ends up in the middle of the output:

    [sched_delayed] ^a4CE: hpet increased min_delta_ns to 20115 nsec

    Fix this by just getting rid of the "[sched_delayed] " scnprintf(). The
    prefix is useless since 458df9fd4815 anyway since from that moment
    printk_deferred() inserts the message into the kernel printk buffer
    immediately. So if the message eventually gets printed to console, it is
    printed in the correct order with other messages and there's no need for
    any special prefix. And if the kernel crashes before the message makes it
    to console, then prefix in the printk buffer doesn't make the situation
    any better.

    Link: http://lkml.org/lkml/2014/9/14/4

    Signed-off-by: Markus Trippelsdorf
    Acked-by: Jan Kara
    Acked-by: Steven Rostedt
    Cc: Geert Uytterhoeven
    Cc: Peter Zijlstra
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Markus Trippelsdorf
     
  • When configuring a uniprocessor kernel, don't bother the user with an
    irrelevant LOG_CPU_MAX_BUF_SHIFT question, and don't build the unused
    code.

    Signed-off-by: Geert Uytterhoeven
    Acked-by: Luis R. Rodriguez
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Geert Uytterhoeven
     

09 Oct, 2014

1 commit


11 Sep, 2014

1 commit

  • We shouldn't set text_len in the code path that detects printk recursion
    because text_len corresponds to the length of the string inside textbuf.
    A few lines down from the line

    text_len = strlen(recursion_msg);

    is the line

    text_len += vscnprintf(text + text_len, ...);

    So if printk detects recursion, it sets text_len to 29 (the length of
    recursion_msg) and logs an error. Then the message supplied by the
    caller of printk is stored inside textbuf but offset by 29 bytes. This
    means that the output of the recursive call to printk will contain 29
    bytes of garbage in front of it.

    This defect is caused by commit 458df9fd4815 ("printk: remove separate
    printk_sched buffers and use printk buf instead") which turned the line

    text_len = vscnprintf(text, ...);

    into

    text_len += vscnprintf(text + text_len, ...);

    To fix this, this patch avoids setting text_len when logging the printk
    recursion error. This patch also marks unlikely() the branch leading up
    to this code.

    Fixes: 458df9fd4815b478 ("printk: remove separate printk_sched buffers and use printk buf instead")
    Signed-off-by: Patrick Palka
    Reviewed-by: Petr Mladek
    Reviewed-by: Jan Kara
    Acked-by: Steven Rostedt
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Patrick Palka
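
The offset problem can be reproduced outside the kernel. The sketch below is an illustration under stated assumptions: `format_text()` and its `buggy` flag are hypothetical, vsnprintf() stands in for vscnprintf(), and the recursion message string is assumed to be the 29-byte text the description refers to.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Assumed message text; 29 bytes long, as stated in the description. */
static const char recursion_msg[] = "BUG: recent printk recursion!";

/*
 * Formats the caller's message into text[] and returns text_len.
 * With buggy != 0 and recursion detected, text_len is preset to the
 * length of recursion_msg, so the message lands 29 bytes into the
 * buffer behind whatever bytes were already there.
 */
static size_t format_text(char *text, size_t size, int recursion, int buggy,
                          const char *fmt, ...)
{
    size_t text_len = 0;
    va_list args;
    int n;

    assert(text != NULL && size > sizeof(recursion_msg));

    if (recursion && buggy)
        text_len = strlen(recursion_msg);   /* the defect */

    va_start(args, fmt);
    n = vsnprintf(text + text_len, size - text_len, fmt, args);
    va_end(args);

    if (n < 0)
        n = 0;
    if ((size_t)n >= size - text_len)       /* clamp like vscnprintf() */
        n = (int)(size - text_len - 1);
    return text_len + (size_t)n;
}
```

Calling `format_text()` on a buffer pre-filled with a marker byte shows the marker surviving in the first 29 positions in the buggy mode, while the fixed mode writes the message at offset 0.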
     

27 Aug, 2014

1 commit


13 Aug, 2014

1 commit

  • Platforms like IBM Power Systems support service processor
    assisted dump. It provides an interface to add memory regions to
    be captured when the system crashes.

    During initialization/running we can add kernel memory regions
    to be collected.

    Presently we don't have a way to get the log buffer base address
    and size. This patch adds support to return log buffer address
    and size.

    Signed-off-by: Vasant Hegde
    Signed-off-by: Benjamin Herrenschmidt
    Acked-by: Andrew Morton

    Vasant Hegde
     

07 Aug, 2014

8 commits

  • Fix coccinelle warnings.

    Signed-off-by: Neil Zhang
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Neil Zhang
     
  • We need interrupts disabled when calling console_trylock_for_printk()
    only so that cpu id we pass to can_use_console() remains valid (for
    other things console_sem provides all the exclusion we need and
    deadlocks on console_sem due to interrupts are impossible because we use
    down_trylock()). However if we are rescheduled, we are guaranteed to
    run on an online cpu so we can easily just get the cpu id in
    can_use_console().

    We can lose a bit of performance when we enable interrupts in
    vprintk_emit() and then disable them again in console_unlock() but OTOH
    it can somewhat reduce interrupt latency caused by console_unlock().

    We differ from (reverted) commit 939f04bec1a4 in that we avoid calling
    console_unlock() from vprintk_emit() with lockdep enabled as that has
    unveiled quite some bugs leading to system freezes during boot (e.g.
    https://lkml.org/lkml/2014/5/30/242,
    https://lkml.org/lkml/2014/6/28/521).

    Signed-off-by: Jan Kara
    Tested-by: Andreas Bombe
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara
     
  • Some small cleanups to kernel/printk/printk.c. None of them should
    cause any change in behavior.

    - When CONFIG_PRINTK is defined, parenthesize the value of LOG_LINE_MAX.
    - When CONFIG_PRINTK is *not* defined, there is an extra LOG_LINE_MAX
    definition; delete it.
    - Pull an assignment out of a conditional expression in console_setup().
    - Use isdigit() in console_setup() rather than open coding it.
    - In update_console_cmdline(), drop a NUL-termination assignment;
    the strlcpy() call that precedes it guarantees it's not needed.
    - Simplify some logic in printk_timed_ratelimit().

    Signed-off-by: Alex Elder
    Reviewed-by: Petr Mladek
    Cc: Andi Kleen
    Cc: Borislav Petkov
    Cc: Jan Kara
    Cc: John Stultz
    Cc: Steven Rostedt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alex Elder
     
  • Use the IS_ENABLED() macro rather than #ifdef blocks to set certain
    global values.

    Signed-off-by: Alex Elder
    Acked-by: Borislav Petkov
    Reviewed-by: Petr Mladek
    Cc: Andi Kleen
    Cc: Jan Kara
    Cc: John Stultz
    Cc: Steven Rostedt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alex Elder
     
  • Fix a few comments that don't accurately describe their corresponding
    code. It also fixes some minor typographical errors.

    Signed-off-by: Alex Elder
    Reviewed-by: Petr Mladek
    Cc: Andi Kleen
    Cc: Borislav Petkov
    Cc: Jan Kara
    Cc: John Stultz
    Cc: Steven Rostedt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alex Elder
     
  • Commit a8fe19ebfbfd ("kernel/printk: use symbolic defines for console
    loglevels") makes consistent use of symbolic values for printk() log
    levels.

    The naming scheme used is different from the one used for
    DEFAULT_MESSAGE_LOGLEVEL though. Change that symbol name to be
    MESSAGE_LOGLEVEL_DEFAULT for consistency. And because the value of that
    symbol comes from a similarly-named config option, rename
    CONFIG_DEFAULT_MESSAGE_LOGLEVEL as well.

    Signed-off-by: Alex Elder
    Cc: Andi Kleen
    Cc: Borislav Petkov
    Cc: Jan Kara
    Cc: John Stultz
    Cc: Petr Mladek
    Cc: Steven Rostedt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alex Elder
     
  • In do_syslog() there's a path used by kmsg_poll() and kmsg_read() that
    only needs to know whether there's any data available to read (and not
    its size). These callers only check for non-zero return. As a
    shortcut, do_syslog() returns the difference between what has been
    logged and what has been "seen."

    The comments say that the "count of records" should be returned but it's
    not. Instead it returns (log_next_idx - syslog_idx), which is a
    difference between buffer offsets--and the result could be negative.

    The behavior is the same (it'll be zero or not in the same cases), but
    the count of records is more meaningful and it matches what the comments
    say. So change the code to return that.

    Signed-off-by: Alex Elder
    Cc: Petr Mladek
    Cc: Jan Kara
    Cc: Joe Perches
    Cc: John Stultz
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alex Elder
     
  • The default size of the ring buffer is too small for machines with a
    large number of CPUs under heavy load. What ends up happening when
    debugging is the ring buffer overlaps and chews up old messages making
    debugging impossible unless the size is passed as a kernel parameter.
    An idle system upon boot up will on average spew out only about one or
    two extra lines but where this really matters is on heavy load and that
    will vary widely depending on the system and environment.

    There are mechanisms to help increase the kernel ring buffer for tracing
    through debugfs, and those interfaces even allow growing the kernel ring
    buffer per CPU. We also have a static value which can be passed upon
    boot. Relying on debugfs however is not ideal for production, and
    relying on the value passed upon bootup can only be used *after* an
    issue has crept up. Instead of being reactive this adds a proactive
    measure which lets you scale the amount of contributions you'd expect to
    the kernel ring buffer under load by each CPU in the worst case
    scenario.

    We use num_possible_cpus() to avoid complexities which could be
    introduced by dynamically changing the ring buffer size at run time,
    num_possible_cpus() lets us use the upper limit on possible number of
    CPUs therefore avoiding having to deal with hotplugging CPUs on and off.
    This introduces the kernel configuration option LOG_CPU_MAX_BUF_SHIFT
    which is used to specify the maximum amount of contributions to the
    kernel ring buffer in the worst case before the kernel ring buffer flips
    over, the size is specified as a power of 2. The total amount of
    contributions made by each CPU must be greater than half of the default
    kernel ring buffer size (1 << LOG_BUF_SHIFT bytes) in order to trigger
    an increase upon bootup. The kernel ring buffer is increased to the
    next power of two that would fit the required minimum kernel ring buffer
    size plus the additional CPU contribution. For example if LOG_BUF_SHIFT
    is 18 (256 KB) you'd require at least 128 KB contributions by other CPUs
    in order to trigger an increase of the kernel ring buffer. With a
    LOG_CPU_MAX_BUF_SHIFT of 12 (4 KB) you'd require more than 64
    possible CPUs to trigger an increase. If you had 128 possible CPUs the
    minimum required kernel ring buffer size bumps to:

    ((1 << 18) + ((128 - 1) * (1 << 12))) / 1024 = 764 KB

    Since we require the ring buffer to be a power of two the new required
    size would be 1024 KB.

    These CPU contributions are ignored when the "log_buf_len" kernel
    parameter is used as it forces the exact size of the ring buffer to an
    expected power of two value.

    [pmladek@suse.cz: fix build]
    Signed-off-by: Luis R. Rodriguez
    Signed-off-by: Petr Mladek
    Tested-by: Davidlohr Bueso
    Tested-by: Petr Mladek
    Reviewed-by: Davidlohr Bueso
    Cc: Andrew Lunn
    Cc: Stephen Warren
    Cc: Michal Hocko
    Cc: Petr Mladek
    Cc: Joe Perches
    Cc: Arun KS
    Cc: Kees Cook
    Cc: Davidlohr Bueso
    Cc: Chris Metcalf
    Cc: Jan Kara
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Luis R. Rodriguez
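
The sizing arithmetic from this description can be checked with a short sketch. The function names are hypothetical and this is a userspace model of the stated rule, not the kernel's implementation: the per-CPU contribution of all CPUs except one is added to the default size only when it exceeds half of the default, and the result is rounded up to a power of two.

```c
#include <assert.h>
#include <stddef.h>

/* Smallest power of two that is >= v (for v >= 1). */
static size_t roundup_pow2(size_t v)
{
    size_t p = 1;

    while (p < v)
        p <<= 1;
    return p;
}

/*
 * buf_shift models LOG_BUF_SHIFT, cpu_shift models LOG_CPU_MAX_BUF_SHIFT,
 * ncpus models num_possible_cpus(). Returns the ring buffer size in bytes.
 */
static size_t log_buf_len_sketch(unsigned buf_shift, unsigned cpu_shift,
                                 unsigned ncpus)
{
    size_t base = (size_t)1 << buf_shift;
    size_t extra;

    assert(ncpus >= 1);
    extra = (size_t)(ncpus - 1) << cpu_shift;

    /* Contributions must exceed half the default size to matter. */
    if (extra <= base / 2)
        return base;

    /* Required minimum is base + extra; keep the size a power of two. */
    return roundup_pow2(base + extra);
}
```

For the worked example above, `log_buf_len_sketch(18, 12, 128)` needs (1 << 18) + 127 * (1 << 12) = 782336 bytes (764 KB) and therefore returns 1048576 bytes (1024 KB); with a single possible CPU the default 256 KB is kept.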