15 Oct, 2014

1 commit

  • Pull percpu consistent-ops changes from Tejun Heo:
    "Way back, before the current percpu allocator was implemented, static
    and dynamic percpu memory areas were allocated and handled separately
    and had their own accessors. The distinction has been gone for many
    years now; however, the now duplicate two sets of accessors remained
    with the pointer based ones - this_cpu_*() - evolving various other
    operations over time. During the process, we also accumulated other
    inconsistent operations.

    This pull request contains Christoph's patches to clean up the
    duplicate accessor situation. __get_cpu_var() uses are replaced with
    with this_cpu_ptr() and __this_cpu_ptr() with raw_cpu_ptr().

    Unfortunately, the former sometimes is tricky thanks to C being a bit
    messy with the distinction between lvalues and pointers, which led to
    a rather ugly solution for cpumask_var_t involving the introduction of
    this_cpu_cpumask_var_ptr().

    This converts most of the uses but not all. Christoph will follow up
    with the remaining conversions in this merge window and hopefully
    remove the obsolete accessors"

    * 'for-3.18-consistent-ops' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (38 commits)
    irqchip: Properly fetch the per cpu offset
    percpu: Resolve ambiguities in __get_cpu_var/cpumask_var_t -fix
    ia64: sn_nodepda cannot be assigned to after this_cpu conversion. Use __this_cpu_write.
    percpu: Resolve ambiguities in __get_cpu_var/cpumask_var_t
    Revert "powerpc: Replace __get_cpu_var uses"
    percpu: Remove __this_cpu_ptr
    clocksource: Replace __this_cpu_ptr with raw_cpu_ptr
    sparc: Replace __get_cpu_var uses
    avr32: Replace __get_cpu_var with __this_cpu_write
    blackfin: Replace __get_cpu_var uses
    tile: Use this_cpu_ptr() for hardware counters
    tile: Replace __get_cpu_var uses
    powerpc: Replace __get_cpu_var uses
    alpha: Replace __get_cpu_var
    ia64: Replace __get_cpu_var uses
    s390: cio driver &__get_cpu_var replacements
    s390: Replace __get_cpu_var uses
    mips: Replace __get_cpu_var uses
    MIPS: Replace __get_cpu_var uses in FPU emulator.
    arm: Replace __this_cpu_ptr with raw_cpu_ptr
    ...

    Linus Torvalds
     

14 Oct, 2014

2 commits

  • Commit 458df9fd4815 ("printk: remove separate printk_sched buffers and use
    printk buf instead") hardcodes printk_deferred() to KERN_WARNING and
    inserts the string "[sched_delayed] " before the actual message. However
    it doesn't take into account the KERN_* prefix of the message, that now
    ends up in the middle of the output:

    [sched_delayed] ^a4CE: hpet increased min_delta_ns to 20115 nsec

    Fix this by just getting rid of the "[sched_delayed] " scnprintf(). The
    prefix is useless since 458df9fd4815 anyway since from that moment
    printk_deferred() inserts the message into the kernel printk buffer
    immediately. So if the message eventually gets printed to console, it is
    printed in the correct order with other messages and there's no need for
    any special prefix. And if the kernel crashes before the message makes it
    to console, then prefix in the printk buffer doesn't make the situation
    any better.

    Link: http://lkml.org/lkml/2014/9/14/4

    Signed-off-by: Markus Trippelsdorf
    Acked-by: Jan Kara
    Acked-by: Steven Rostedt
    Cc: Geert Uytterhoeven
    Cc: Peter Zijlstra
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Markus Trippelsdorf
     
  • When configuring a uniprocessor kernel, don't bother the user with an
    irrelevant LOG_CPU_MAX_BUF_SHIFT question, and don't build the unused
    code.

    Signed-off-by: Geert Uytterhoeven
    Acked-by: Luis R. Rodriguez
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Geert Uytterhoeven
     

09 Oct, 2014

1 commit


11 Sep, 2014

1 commit

  • We shouldn't set text_len in the code path that detects printk recursion
    because text_len corresponds to the length of the string inside textbuf.
    A few lines down from the line

    text_len = strlen(recursion_msg);

    is the line

    text_len += vscnprintf(text + text_len, ...);

    So if printk detects recursion, it sets text_len to 29 (the length of
    recursion_msg) and logs an error. Then the message supplied by the
    caller of printk is stored inside textbuf but offset by 29 bytes. This
    means that the output of the recursive call to printk will contain 29
    bytes of garbage in front of it.

    This defect is caused by commit 458df9fd4815 ("printk: remove separate
    printk_sched buffers and use printk buf instead") which turned the line

    text_len = vscnprintf(text, ...);

    into

    text_len += vscnprintf(text + text_len, ...);

    To fix this, this patch avoids setting text_len when logging the printk
    recursion error. This patch also marks unlikely() the branch leading up
    to this code.

    Fixes: 458df9fd4815b478 ("printk: remove separate printk_sched buffers and use printk buf instead")
    Signed-off-by: Patrick Palka
    Reviewed-by: Petr Mladek
    Reviewed-by: Jan Kara
    Acked-by: Steven Rostedt
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Patrick Palka
     

27 Aug, 2014

1 commit


13 Aug, 2014

1 commit

  • Platforms like IBM Power Systems supports service processor
    assisted dump. It provides interface to add memory region to
    be captured when system is crashed.

    During initialization/running we can add kernel memory region
    to be collected.

    Presently we don't have a way to get the log buffer base address
    and size. This patch adds support to return log buffer address
    and size.

    Signed-off-by: Vasant Hegde
    Signed-off-by: Benjamin Herrenschmidt
    Acked-by: Andrew Morton

    Vasant Hegde
     

07 Aug, 2014

11 commits

  • Fix coccinelle warnings.

    Signed-off-by: Neil Zhang
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Neil Zhang
     
  • We need interrupts disabled when calling console_trylock_for_printk()
    only so that cpu id we pass to can_use_console() remains valid (for
    other things console_sem provides all the exclusion we need and
    deadlocks on console_sem due to interrupts are impossible because we use
    down_trylock()). However if we are rescheduled, we are guaranteed to
    run on an online cpu so we can easily just get the cpu id in
    can_use_console().

    We can lose a bit of performance when we enable interrupts in
    vprintk_emit() and then disable them again in console_unlock() but OTOH
    it can somewhat reduce interrupt latency caused by console_unlock().

    We differ from (reverted) commit 939f04bec1a4 in that we avoid calling
    console_unlock() from vprintk_emit() with lockdep enabled as that has
    unveiled quite some bugs leading to system freezes during boot (e.g.
    https://lkml.org/lkml/2014/5/30/242,
    https://lkml.org/lkml/2014/6/28/521).

    Signed-off-by: Jan Kara
    Tested-by: Andreas Bombe
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara
     
  • Some small cleanups to kernel/printk/printk.c. None of them should
    cause any change in behavior.

    - When CONFIG_PRINTK is defined, parenthesize the value of LOG_LINE_MAX.
    - When CONFIG_PRINTK is *not* defined, there is an extra LOG_LINE_MAX
    definition; delete it.
    - Pull an assignment out of a conditional expression in console_setup().
    - Use isdigit() in console_setup() rather than open coding it.
    - In update_console_cmdline(), drop a NUL-termination assignment;
    the strlcpy() call that precedes it guarantees it's not needed.
    - Simplify some logic in printk_timed_ratelimit().

    Signed-off-by: Alex Elder
    Reviewed-by: Petr Mladek
    Cc: Andi Kleen
    Cc: Borislav Petkov
    Cc: Jan Kara
    Cc: John Stultz
    Cc: Steven Rostedt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alex Elder
     
  • Use the IS_ENABLED() macro rather than #ifdef blocks to set certain
    global values.

    Signed-off-by: Alex Elder
    Acked-by: Borislav Petkov
    Reviewed-by: Petr Mladek
    Cc: Andi Kleen
    Cc: Jan Kara
    Cc: John Stultz
    Cc: Steven Rostedt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alex Elder
     
  • Fix a few comments that don't accurately describe their corresponding
    code. It also fixes some minor typographical errors.

    Signed-off-by: Alex Elder
    Reviewed-by: Petr Mladek
    Cc: Andi Kleen
    Cc: Borislav Petkov
    Cc: Jan Kara
    Cc: John Stultz
    Cc: Steven Rostedt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alex Elder
     
  • Commit a8fe19ebfbfd ("kernel/printk: use symbolic defines for console
    loglevels") makes consistent use of symbolic values for printk() log
    levels.

    The naming scheme used is different from the one used for
    DEFAULT_MESSAGE_LOGLEVEL though. Change that symbol name to be
    MESSAGE_LOGLEVEL_DEFAULT for consistency. And because the value of that
    symbol comes from a similarly-named config option, rename
    CONFIG_DEFAULT_MESSAGE_LOGLEVEL as well.

    Signed-off-by: Alex Elder
    Cc: Andi Kleen
    Cc: Borislav Petkov
    Cc: Jan Kara
    Cc: John Stultz
    Cc: Petr Mladek
    Cc: Steven Rostedt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alex Elder
     
  • In do_syslog() there's a path used by kmsg_poll() and kmsg_read() that
    only needs to know whether there's any data available to read (and not
    its size). These callers only check for non-zero return. As a
    shortcut, do_syslog() returns the difference between what has been
    logged and what has been "seen."

    The comments say that the "count of records" should be returned but it's
    not. Instead it returns (log_next_idx - syslog_idx), which is a
    difference between buffer offsets--and the result could be negative.

    The behavior is the same (it'll be zero or not in the same cases), but
    the count of records is more meaningful and it matches what the comments
    say. So change the code to return that.

    Signed-off-by: Alex Elder
    Cc: Petr Mladek
    Cc: Jan Kara
    Cc: Joe Perches
    Cc: John Stultz
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alex Elder
     
  • The default size of the ring buffer is too small for machines with a
    large amount of CPUs under heavy load. What ends up happening when
    debugging is the ring buffer overlaps and chews up old messages making
    debugging impossible unless the size is passed as a kernel parameter.
    An idle system upon boot up will on average spew out only about one or
    two extra lines but where this really matters is on heavy load and that
    will vary widely depending on the system and environment.

    There are mechanisms to help increase the kernel ring buffer for tracing
    through debugfs, and those interfaces even allow growing the kernel ring
    buffer per CPU. We also have a static value which can be passed upon
    boot. Relying on debugfs however is not ideal for production, and
    relying on the value passed upon bootup is can only used *after* an
    issue has creeped up. Instead of being reactive this adds a proactive
    measure which lets you scale the amount of contributions you'd expect to
    the kernel ring buffer under load by each CPU in the worst case
    scenario.

    We use num_possible_cpus() to avoid complexities which could be
    introduced by dynamically changing the ring buffer size at run time,
    num_possible_cpus() lets us use the upper limit on possible number of
    CPUs therefore avoiding having to deal with hotplugging CPUs on and off.
    This introduces the kernel configuration option LOG_CPU_MAX_BUF_SHIFT
    which is used to specify the maximum amount of contributions to the
    kernel ring buffer in the worst case before the kernel ring buffer flips
    over, the size is specified as a power of 2. The total amount of
    contributions made by each CPU must be greater than half of the default
    kernel ring buffer size (1 << LOG_BUF_SHIFT bytes) in order to trigger
    an increase upon bootup. The kernel ring buffer is increased to the
    next power of two that would fit the required minimum kernel ring buffer
    size plus the additional CPU contribution. For example if LOG_BUF_SHIFT
    is 18 (256 KB) you'd require at least 128 KB contributions by other CPUs
    in order to trigger an increase of the kernel ring buffer. With a
    LOG_CPU_BUF_SHIFT of 12 (4 KB) you'd require at least anything over > 64
    possible CPUs to trigger an increase. If you had 128 possible CPUs the
    amount of minimum required kernel ring buffer bumps to:

    ((1 << 18) + ((128 - 1) * (1 << 12))) / 1024 = 764 KB

    Since we require the ring buffer to be a power of two the new required
    size would be 1024 KB.

    This CPU contributions are ignored when the "log_buf_len" kernel
    parameter is used as it forces the exact size of the ring buffer to an
    expected power of two value.

    [pmladek@suse.cz: fix build]
    Signed-off-by: Luis R. Rodriguez
    Signed-off-by: Petr Mladek
    Tested-by: Davidlohr Bueso
    Tested-by: Petr Mladek
    Reviewed-by: Davidlohr Bueso
    Cc: Andrew Lunn
    Cc: Stephen Warren
    Cc: Michal Hocko
    Cc: Petr Mladek
    Cc: Joe Perches
    Cc: Arun KS
    Cc: Kees Cook
    Cc: Davidlohr Bueso
    Cc: Chris Metcalf
    Cc: Jan Kara
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Luis R. Rodriguez
     
  • Signed-off-by: Luis R. Rodriguez
    Suggested-by: Davidlohr Bueso
    Cc: Andrew Lunn
    Cc: Stephen Warren
    Cc: Greg Kroah-Hartman
    Cc: Michal Hocko
    Cc: Petr Mladek
    Cc: Joe Perches
    Cc: Arun KS
    Cc: Kees Cook
    Cc: Davidlohr Bueso
    Cc: Chris Metcalf
    Cc: Jan Kara
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Luis R. Rodriguez
     
  • In practice the power of 2 practice of the size of the kernel ring
    buffer remains purely historical but not a requirement, specially now
    that we have LOG_ALIGN and use it for both static and dynamic
    allocations. It could have helped with implicit alignment back in the
    days given the even the dynamically sized ring buffer was guaranteed to
    be aligned so long as CONFIG_LOG_BUF_SHIFT was set to produce a
    __LOG_BUF_LEN which is architecture aligned, since log_buf_len=n would
    be allowed only if it was > __LOG_BUF_LEN and we always ended up
    rounding the log_buf_len=n to the next power of 2 with
    roundup_pow_of_two(), any multiple of 2 then should be also architecture
    aligned. These assumptions of course relied heavily on
    CONFIG_LOG_BUF_SHIFT producing an aligned value but users can always
    change this.

    We now have precise alignment requirements set for the log buffer size
    for both static and dynamic allocations, but lets upkeep the old
    practice of using powers of 2 for its size to help with easy expected
    scalable values and the allocators for dynamic allocations. We'll reuse
    this later so move this into a helper.

    Signed-off-by: Luis R. Rodriguez
    Cc: Andrew Lunn
    Cc: Stephen Warren
    Cc: Greg Kroah-Hartman
    Cc: Michal Hocko
    Cc: Petr Mladek
    Cc: Joe Perches
    Cc: Arun KS
    Cc: Kees Cook
    Cc: Davidlohr Bueso
    Cc: Chris Metcalf
    Cc: Jan Kara
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Luis R. Rodriguez
     
  • We have to consider alignment for the ring buffer both for the default
    static size, and then also for when an dynamic allocation is made when
    the log_buf_len=n kernel parameter is passed to set the size
    specifically to a size larger than the default size set by the
    architecture through CONFIG_LOG_BUF_SHIFT.

    The default static kernel ring buffer can be aligned properly if
    architectures set CONFIG_LOG_BUF_SHIFT properly, we provide ranges for
    the size though so even if CONFIG_LOG_BUF_SHIFT has a sensible aligned
    value it can be reduced to a non aligned value. Commit 6ebb017de9
    ("printk: Fix alignment of buf causing crash on ARM EABI") by Andrew
    Lunn ensures the static buffer is always aligned and the decision of
    alignment is done by the compiler by using __alignof__(struct log).

    When log_buf_len=n is used we allocate the ring buffer dynamically.
    Dynamic allocation varies, for the early allocation called before
    setup_arch() memblock_virt_alloc() requests a page aligment and for the
    default kernel allocation memblock_virt_alloc_nopanic() requests no
    special alignment, which in turn ends up aligning the allocation to
    SMP_CACHE_BYTES, which is L1 cache aligned.

    Since we already have the required alignment for the kernel ring buffer
    though we can do better and request explicit alignment for LOG_ALIGN.
    This does that to be safe and make dynamic allocation alignment
    explicit.

    Signed-off-by: Luis R. Rodriguez
    Tested-by: Petr Mladek
    Acked-by: Petr Mladek
    Cc: Andrew Lunn
    Cc: Stephen Warren
    Cc: Greg Kroah-Hartman
    Cc: Michal Hocko
    Cc: Petr Mladek
    Cc: Joe Perches
    Cc: Arun KS
    Cc: Kees Cook
    Cc: Davidlohr Bueso
    Cc: Chris Metcalf
    Cc: Jan Kara
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Luis R. Rodriguez
     

04 Jul, 2014

1 commit

  • …_trylock_for_printk()"

    Revert commit 939f04bec1a4 ("printk: enable interrupts before calling
    console_trylock_for_printk()").

    Andreas reported:

    : None of the post 3.15 kernel boot for me. They all hang at the GRUB
    : screen telling me it loaded and started the kernel, but the kernel
    : itself stops before it prints anything (or even replaces the GRUB
    : background graphics).

    939f04bec1a4 is modest latency reduction. Revert it until we understand
    the reason for these failures.

    Reported-by: Andreas Bombe <aeb@debian.org>
    Cc: Jan Kara <jack@suse.cz>
    Cc: Steven Rostedt <rostedt@goodmis.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

    Andrew Morton
     

05 Jun, 2014

14 commits

  • ... instead of naked numbers.

    Stuff in sysrq.c used to set it to 8 which is supposed to mean above
    default level so set it to DEBUG instead as we're terminating/killing all
    tasks and we want to be verbose there.

    Also, correct the check in x86_64_start_kernel which should be >= as
    we're clearly issuing the string there for all debug levels, not only
    the magical 10.

    Signed-off-by: Borislav Petkov
    Acked-by: Kees Cook
    Acked-by: Randy Dunlap
    Cc: Joe Perches
    Cc: Valdis Kletnieks
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Borislav Petkov
     
  • If the log ring buffer becomes full, we silently overwrite old messages
    with new data. console_unlock will detect this case and fast-forward the
    console_* pointers to skip over the corrupted data, but nothing will be
    reported to the user.

    This patch hijacks the first valid log message after detecting that we
    dropped messages and prefixes it with a note detailing how many messages
    were dropped. For long (~1000 char) messages, this will result in some
    truncation of the real message, but given that we're dropping things
    anyway, that doesn't seem to be the end of the world.

    Signed-off-by: Will Deacon
    Acked-by: Peter Zijlstra
    Cc: Kay Sievers
    Cc: Jan Kara
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Will Deacon
     
  • After learning we'll need some sort of deferred printk functionality in
    the timekeeping core, Peter suggested we rename the printk_sched function
    so it can be reused by needed subsystems.

    This only changes the function name. No logic changes.

    Signed-off-by: John Stultz
    Reviewed-by: Steven Rostedt
    Cc: Jan Kara
    Cc: Peter Zijlstra
    Cc: Jiri Bohac
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    John Stultz
     
  • An earlier change in -mm (printk: remove separate printk_sched
    buffers...), removed the printk_sched irqsave/restore lines since it was
    safe for current users. Since we may be expanding usage of
    printk_sched(), disable preepmtion for this function to make it more
    generally safe to call.

    Signed-off-by: John Stultz
    Reviewed-by: Jan Kara
    Cc: Peter Zijlstra
    Cc: Jiri Bohac
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: Steven Rostedt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    John Stultz
     
  • To prevent deadlocks with doing a printk inside the scheduler,
    printk_sched() was created. The issue is that printk has a console_sem
    that it can grab and release. The release does a wake up if there's a
    task pending on the sem, and this wake up grabs the rq locks that is
    held in the scheduler. This leads to a possible deadlock if the wake up
    uses the same rq as the one with the rq lock held already.

    What printk_sched() does is to save the printk write in a per cpu buffer
    and sets the PRINTK_PENDING_SCHED flag. On a timer tick, if this flag is
    set, the printk() is done against the buffer.

    There's a couple of issues with this approach.

    1) If two printk_sched()s are called before the tick, the second one
    will overwrite the first one.

    2) The temporary buffer is 512 bytes and is per cpu. This is a quite a
    bit of space wasted for something that is seldom used.

    In order to remove this, the printk_sched() can use the printk buffer
    instead, and delay the console_trylock()/console_unlock() to the queued
    work.

    Because printk_sched() would then be taking the logbuf_lock, the
    logbuf_lock must not be held while doing anything that may call into the
    scheduler functions, which includes wake ups. Unfortunately, printk()
    also has a console_sem that it uses, and on release, the up(&console_sem)
    may do a wake up of any pending waiters. This must be avoided while
    holding the logbuf_lock.

    Signed-off-by: Steven Rostedt
    Signed-off-by: Jan Kara
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Steven Rostedt
     
  • We need interrupts disabled when calling console_trylock_for_printk()
    only so that cpu id we pass to can_use_console() remains valid (for
    other things console_sem provides all the exclusion we need and
    deadlocks on console_sem due to interrupts are impossible because we use
    down_trylock()). However if we are rescheduled, we are guaranteed to
    run on an online cpu so we can easily just get the cpu id in
    can_use_console().

    We can lose a bit of performance when we enable interrupts in
    vprintk_emit() and then disable them again in console_unlock() but OTOH
    it can somewhat reduce interrupt latency caused by console_unlock()
    especially since later in the patch series we will want to spin on
    console_sem in console_trylock_for_printk().

    Signed-off-by: Jan Kara
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara
     
  • Printk calls mutex_acquire() / mutex_release() by hand to instrument
    lockdep about console_sem. However in some corner cases the
    instrumentation is missing. Fix the problem by creating helper functions
    for locking / unlocking console_sem which take care of lockdep
    instrumentation as well.

    Signed-off-by: Jan Kara
    Reported-by: Fabio Estevam
    Reported-by: Andy Shevchenko
    Tested-by: Fabio Estevam
    Tested-By: Valdis Kletnieks
    Cc: Steven Rostedt
    Cc: Peter Zijlstra
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara
     
  • There's no reason to hold lockbuf_lock when entering
    console_trylock_for_printk().

    The first thing this function does is to call down_trylock(console_sem)
    and if that fails it immediately unlocks lockbuf_lock. So lockbuf_lock
    isn't needed for that branch. When down_trylock() succeeds, the rest of
    console_trylock() is OK without lockbuf_lock (it is called without it
    from other places), and the only remaining thing in
    console_trylock_for_printk() is can_use_console() call. For that call
    console_sem is enough (it iterates all consoles and checks CON_ANYTIME
    flag).

    So we drop logbuf_lock before entering console_trylock_for_printk() which
    simplifies the code.

    [akpm@linux-foundation.org: fix have_callable_console() comment]
    Signed-off-by: Jan Kara
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara
     
  • Comment about interesting interlocking between lockbuf_lock and
    console_sem is outdated.

    It was added in 2002 by commit a880f45a48be during conversion of
    console_lock to console_sem + lockbuf_lock.

    At that time release_console_sem() (today's equivalent is
    console_unlock()) was indeed using lockbuf_lock to avoid races between
    trylock on console_sem in printk() and unlock of console_sem. However
    these days the interlocking is gone and the races are avoided by
    rechecking logbuf state after releasing console_sem.

    Signed-off-by: Jan Kara
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara
     
  • I wonder if anyone uses printk return value but it is there and should be
    counted correctly.

    This patch modifies log_store() to return the number of really stored
    bytes from the 'text' part. Also it handles the return value in
    vprintk_emit().

    Note that log_store() is used also in cont_flush() but we could ignore the
    return value there. The function works with characters that were already
    counted earlier. In addition, the store could newer fail here because the
    length of the printed text is limited by the "cont" buffer and "dict" is
    NULL.

    Signed-off-by: Petr Mladek
    Cc: Jan Kara
    Cc: Jiri Kosina
    Cc: Kay Sievers
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Petr Mladek
     
  • We might want to print at least part of too long messages and add some
    warning for debugging purpose.

    The question is how long the shrunken message should be. If we use the
    whole buffer, it might get rotated too soon. Let's try to use only 1/4 of
    the buffer for now.

    Also shrink the whole dictionary. We do not want to parse it or break it
    in the middle of some pair of values. It would not cause any real harm
    but still.

    Signed-off-by: Petr Mladek
    Cc: Jan Kara
    Cc: Jiri Kosina
    Cc: Kay Sievers
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Petr Mladek
     
  • We will want to recompute the message size when shrinking too long
    messages. Let's put the code into separate function.

    The side effect of setting "pad_len" is not nice but it is worth removing
    the code duplication. Note that I will probably have one more usage for
    this function when handling messages safe way in NMI context.

    This patch does not change the existing behavior.

    Signed-off-by: Petr Mladek
    Cc: Jan Kara
    Cc: Jiri Kosina
    Cc: Kay Sievers
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Petr Mladek
     
  • There was no check for too long messages. The check for free space always
    passed when first_seq and next_seq were equal. Enough free space was not
    guaranteed, though.

    log_store() might be called to store messages up to 64kB + 64kB + 16B.
    This is sum of maximal text_len, dict_len values, and the size of the
    structure printk_log.

    On the other hand, the minimal size for the main log buffer currently is
    4kB and it is enforced only by Kconfig.

    The good news is that the usage looks safe right now. log_store() is
    called only from vprintk_emit() and cont_flush(). Here the "text" part is
    always passed via a static buffer and the length is limited to
    LOG_LINE_MAX which is 1024. The "dict" part is NULL in most cases. The
    only exceptions is when vprintk_emit() is called from printk_emit() and
    dev_vprintk_emit(). But printk_emit() is currently used only in
    devkmsg_writev() and here "dict" is NULL as well. In dev_vprintk_emit(),
    "dict" is limited by the static buffer "hdr" of the size 128 bytes. It
    meas that the current maximal printed text is 1024B + 128B + 16B and it
    always fit the log buffer.

    But it is only matter of time when someone calls printk_emit() with unsafe
    parameters, especially the "dict" one.

    This patch adds a check for the free space when the buffer is empty. It
    reuses the already existing log_has_space() function but it has to add an
    extra parameter. It defines whether the buffer is empty. Note that the
    same values of "first_idx" and "next_idx" might also mean that the buffer
    is full.

    If the buffer is empty, we must respect the current position of the
    indexes. We cannot reset them to the beginning of the buffer. Otherwise,
    the functions reading the buffer would get crazy.

    The question is what to do when the message is too long. This patch uses
    the easiest solution and just ignores the problematic message. Let's do
    something better in a followup patch.

    Signed-off-by: Petr Mladek
    Cc: Jan Kara
    Cc: Jiri Kosina
    Cc: Kay Sievers
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Petr Mladek
     
  • The check for free space in the log buffer always passes when "first_seq"
    and "next_seq" are equal. In theory, it might cause writing outside of
    the log buffer.

    Fortunately, the current usage looks safe because the used "text" and
    "dict" buffers are quite limited. See the second patch for more details.

    Anyway, it is better to be on the safe side and add a check. An easy
    solution is done in the 2nd patch and it is improved in the 4th patch.

    5th patch fixes the computation of the printed message length.

    1st and 3rd patches just do some code refactoring to make the other
    patches easier.

    This patch (of 5):

    There will be needed some fixes in the check for free space. They will be
    easier if the code is moved outside of the quite long log_store()
    function.

    This patch does not change the existing behavior.

    Signed-off-by: Petr Mladek
    Cc: Jan Kara
    Cc: Jiri Kosina
    Cc: Kay Sievers
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Petr Mladek
     

04 Jun, 2014

1 commit

  • Pull tty/serial driver updates from Greg KH:
    "Here is the big tty / serial driver pull request for 3.16-rc1.

    A variety of different serial driver fixes and updates and additions,
    nothing huge, and no real major core tty changes at all.

    All have been in linux-next for a while"

    * tag 'tty-3.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (84 commits)
    Revert "serial: imx: remove the DMA wait queue"
    serial: kgdb_nmi: Improve console integration with KDB I/O
    serial: kgdb_nmi: Switch from tasklets to real timers
    serial: kgdb_nmi: Use container_of() to locate private data
    serial: cpm_uart: No LF conversion in put_poll_char()
    serial: sirf: Fix compilation failure
    console: Remove superfluous readonly check
    console: Use explicit pointer type for vc_uni_pagedir* fields
    vgacon: Fix & cleanup refcounting
    ARM: tty: Move HVC DCC assembly to arch/arm
    tty/hvc/hvc_console: Fix wakeup of HVC thread on hvc_kick()
    drivers/tty/n_hdlc.c: replace kmalloc/memset by kzalloc
    vt: emulate 8- and 24-bit colour codes.
    printk/of_serial: fix serial console cessation part way through boot.
    serial: 8250_dma: check the result of TX buffer mapping
    serial: uart: add hw flow control support configuration
    tty/serial: at91: add interrupts for modem control lines
    tty/serial: at91: use mctrl_gpio helpers
    tty/serial: Add GPIOLIB helpers for controlling modem lines
    ARM: at91: gpio: implement get_direction
    ...

    Linus Torvalds
     

29 May, 2014

1 commit

  • Commit 5f5c9ae56c38942623f69c3e6dc6ec78e4da2076
    "serial_core: Unregister console in uart_remove_one_port()"
    fixed a crash where a serial port was removed but
    not deregistered as a console.

    There is a side effect of that commit for platforms having serial consoles
    and of_serial configured (CONFIG_SERIAL_OF_PLATFORM). The serial console
    is disabled midway through the boot process.

    This cessation of the serial console affects PowerPC computers
    such as the MVME5100 and SAM440EP.

    The sequence is:

    bootconsole [udbg0] enabled
    ....
    serial8250/16550 driver initialises and registers its UARTS,
    one of these is the serial console.
    console [ttyS0] enabled
    ....
    of_serial probes "platform" devices, registering them as it goes.
    One of these is the serial console.
    console [ttyS0] disabled.

    The disabling of the serial console is due to:

    a. unregister_console in printk not clearing the
    CONS_ENABLED bit in the console flags,
    even though it has announced that the console is disabled; and

    b. of_platform_serial_probe in of_serial not setting the port type
    before it registers with serial8250_register_8250_port.

    This patch ensures that the serial console is re-enabled when of_serial
    registers a serial port that corresponds to the designated console.

    Signed-off-by: Stephen Chivers
    Tested-by: Stephen Chivers
    Acked-by: Geert Uytterhoeven [unregister_console]
    Cc: stable # 3.15

    ===
    The above failure was identified in Linux-3.15-rc2.

    Tested using MVME5100 and SAM440EP PowerPC computers with
    kernels built from Linux-3.15-rc5 and tty-next.

    The continued operation of the serial console is vital for computers
    such as the MVME5100 as that Single Board Computer does not
    have any grapical/display hardware.
    Signed-off-by: Greg Kroah-Hartman

    Stephen Chivers
     

06 May, 2014

1 commit


04 Apr, 2014

4 commits

  • Fix a warning about possible circular locking dependency.

    If do in following sequence:

    enter suspend -> resume -> plug-out CPUx (echo 0 > cpux/online)

    lockdep will show warning as following:

    ======================================================
    [ INFO: possible circular locking dependency detected ]
    3.10.0 #2 Tainted: G O
    -------------------------------------------------------
    sh/1271 is trying to acquire lock:
    (console_lock){+.+.+.}, at: console_cpu_notify+0x20/0x2c
    but task is already holding lock:
    (cpu_hotplug.lock){+.+.+.}, at: cpu_hotplug_begin+0x2c/0x58
    which lock already depends on the new lock.

    the existing dependency chain (in reverse order) is:
    -> #2 (cpu_hotplug.lock){+.+.+.}:
    lock_acquire+0x98/0x12c
    mutex_lock_nested+0x50/0x3d8
    cpu_hotplug_begin+0x2c/0x58
    _cpu_up+0x24/0x154
    cpu_up+0x64/0x84
    smp_init+0x9c/0xd4
    kernel_init_freeable+0x78/0x1c8
    kernel_init+0x8/0xe4
    ret_from_fork+0x14/0x2c

    -> #1 (cpu_add_remove_lock){+.+.+.}:
    lock_acquire+0x98/0x12c
    mutex_lock_nested+0x50/0x3d8
    disable_nonboot_cpus+0x8/0xe8
    suspend_devices_and_enter+0x214/0x448
    pm_suspend+0x1e4/0x284
    try_to_suspend+0xa4/0xbc
    process_one_work+0x1c4/0x4fc
    worker_thread+0x138/0x37c
    kthread+0xa4/0xb0
    ret_from_fork+0x14/0x2c

    -> #0 (console_lock){+.+.+.}:
    __lock_acquire+0x1b38/0x1b80
    lock_acquire+0x98/0x12c
    console_lock+0x54/0x68
    console_cpu_notify+0x20/0x2c
    notifier_call_chain+0x44/0x84
    __cpu_notify+0x2c/0x48
    cpu_notify_nofail+0x8/0x14
    _cpu_down+0xf4/0x258
    cpu_down+0x24/0x40
    store_online+0x30/0x74
    dev_attr_store+0x18/0x24
    sysfs_write_file+0x16c/0x19c
    vfs_write+0xb4/0x190
    SyS_write+0x3c/0x70
    ret_fast_syscall+0x0/0x48

    Chain exists of:
    console_lock --> cpu_add_remove_lock --> cpu_hotplug.lock

    Possible unsafe locking scenario:
    CPU0 CPU1
    ---- ----
    lock(cpu_hotplug.lock);
    lock(cpu_add_remove_lock);
    lock(cpu_hotplug.lock);
    lock(console_lock);
    *** DEADLOCK ***

    There are three locks involved in two sequence:
    a) pm suspend:
    console_lock (@suspend_console())
    cpu_add_remove_lock (@disable_nonboot_cpus())
    cpu_hotplug.lock (@_cpu_down())
    b) Plug-out CPUx:
    cpu_add_remove_lock (@(cpu_down())
    cpu_hotplug.lock (@_cpu_down())
    console_lock (@console_cpu_notify()) => Lockdeps prints warning log.

    There should be not real deadlock, as flag of console_suspended can
    protect this.

    Although console_suspend() releases console_sem, it doesn't tell lockdep
    about it. That results in the lockdep warning about circular locking
    when doing the following: enter suspend -> resume -> plug-out CPUx (echo
    0 > cpux/online)

    Fix the problem by telling lockdep we actually released the semaphore in
    console_suspend() and acquired it again in console_resume().

    Signed-off-by: Jane Li
    Reviewed-by: Jan Kara
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jane Li
     
  • This is just a tiny optimization. It removes duplicate computation of
    the message size.

    Signed-off-by: Petr Mladek
    Cc: Steven Rostedt
    Cc: Frederic Weisbecker
    Cc: Jan Kara
    Cc: Michal Hocko
    Cc: Kay Sievers
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Petr Mladek
     
  • It seems that we have newer used the last byte in the ring buffer. In
    fact, we have newer used the last 4 bytes because of padding.

    First problem is in the check for free space. The exact number of free
    bytes is enough to store the length of data.

    Second problem is in the check where the ring buffer is rotated. The
    left side counts the first unused index. It is unused, so it might be
    the same as the size of the buffer.

    Note that the first problem has to be fixed together with the second
    one. Otherwise, the buffer is rotated even when there is enough space
    on the end of the buffer. Then the beginning of the buffer is rewritten
    and valid entries get corrupted.

    Signed-off-by: Petr Mladek
    Cc: Steven Rostedt
    Cc: Frederic Weisbecker
    Cc: Jan Kara
    Cc: Michal Hocko
    Cc: Kay Sievers
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Petr Mladek
     
  • There is no check for potential "text_len" overflow. It is not needed
    because only valid level is detected. It took me some time to
    understand why. It would deserve a comment ;-)

    Signed-off-by: Petr Mladek
    Cc: Steven Rostedt
    Cc: Frederic Weisbecker
    Cc: Jan Kara
    Cc: Michal Hocko
    Cc: Kay Sievers
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Petr Mladek