28 Nov, 2020

1 commit


27 Nov, 2020

2 commits

  • Petr Mladek
     
  • Any record with a trailing newline (LOG_NEWLINE flag) cannot
    be continued because the newline has been stripped and will
    not be visible if the message is appended. This was already
    handled correctly when committing in log_output() but was
    not handled correctly when committing in log_store().

    Fixes: f5f022e53b87 ("printk: reimplement log_cont using record extension")
    Link: https://lore.kernel.org/r/20201126114836.14750-1-john.ogness@linutronix.de
    Reported-by: Kefeng Wang
    Signed-off-by: John Ogness
    Tested-by: Kefeng Wang
    Reviewed-by: Petr Mladek
    Signed-off-by: Petr Mladek

    John Ogness
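
The rule in the message above can be sketched in user-space C. This is only an illustration: the flag values and the helper name may_be_continued() are assumptions made for the example, not the kernel's log_store() code.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag values assumed for illustration. */
#define LOG_NEWLINE	2	/* text ended with a newline (now stripped) */
#define LOG_CONT	8	/* text is a fragment of a continuation line */

/*
 * Decide whether a freshly stored record may stay open for appending.
 * A record whose trailing newline was stripped must not be extended,
 * because the line break would silently disappear from the output.
 */
static bool may_be_continued(unsigned int flags)
{
	return !(flags & LOG_NEWLINE);
}
```

A caller would finalize the record whenever may_be_continued() returns false and use a plain commit otherwise.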
     

06 Nov, 2020

1 commit

  • make clang-analyzer on x86_64 defconfig caught my attention with:

    kernel/printk/printk_ringbuffer.c:885:3: warning:
    Value stored to 'desc' is never read [clang-analyzer-deadcode.DeadStores]

    desc = to_desc(desc_ring, head_id);
    ^

    Commit b6cf8b3f3312 ("printk: add lockless ringbuffer") introduced
    desc_reserve() with this unneeded dead-store assignment.

    As discussed with John Ogness privately, this is probably just some minor
    left-over from previous iterations of the ringbuffer implementation. So,
    simply remove this unneeded dead assignment to make clang-analyzer happy.

    As compilers will detect this unneeded assignment and optimize this anyway,
    the resulting object code is identical before and after this change.

    No functional change. No change to object code.

    Signed-off-by: Lukas Bulwahn
    Reviewed-by: Sergey Senozhatsky
    Reviewed-by: John Ogness
    Reviewed-by: Nathan Chancellor
    Signed-off-by: Petr Mladek
    Link: https://lore.kernel.org/r/20201106034005.18822-1-lukas.bulwahn@gmail.com

    Lukas Bulwahn
     

31 Oct, 2020

1 commit

  • There is a regular need in the kernel to provide a way to declare having a
    dynamically sized set of trailing elements in a structure. Kernel code should
    always use “flexible array members”[1] for these cases. The older style of
    one-element or zero-length arrays should no longer be used[2].

    [1] https://en.wikipedia.org/wiki/Flexible_array_member
    [2] https://www.kernel.org/doc/html/v5.9/process/deprecated.html#zero-length-and-one-element-arrays

    Signed-off-by: Gustavo A. R. Silva

    Gustavo A. R. Silva
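
For readers unfamiliar with the construct, here is a minimal user-space sketch of a flexible array member. The struct msg type and the msg_new() helper are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record type; the trailing array has no declared length. */
struct msg {
	size_t len;
	char text[];	/* flexible array member (C99) */
};

/* Allocate a msg with room for @len bytes of text plus a terminator. */
static struct msg *msg_new(const char *src, size_t len)
{
	struct msg *m;

	assert(src != NULL);
	m = malloc(sizeof(*m) + len + 1);
	if (!m)
		return NULL;
	m->len = len;
	memcpy(m->text, src, len);
	m->text[len] = '\0';
	return m;
}
```

Unlike a one-element array, sizeof(struct msg) here does not count any trailing element, so the allocation size is simply the header plus the payload.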
     

17 Oct, 2020

1 commit


16 Oct, 2020

1 commit

  • Pull trivial updates from Jiri Kosina:
    "The latest advances in computer science from the trivial queue"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial:
    xtensa: fix Kconfig typo
    spelling.txt: Remove some duplicate entries
    mtd: rawnand: oxnas: cleanup/simplify code
    selftests: vm: add fragment CONFIG_GUP_BENCHMARK
    perf: Fix opt help text for --no-bpf-event
    HID: logitech-dj: Fix spelling in comment
    bootconfig: Fix kernel message mentioning CONFIG_BOOT_CONFIG
    MAINTAINERS: rectify MMP SUPPORT after moving cputype.h
    scif: Fix spelling of EACCES
    printk: fix global comment
    lib/bitmap.c: fix spello
    fs: Fix missing 'bit' in comment

    Linus Torvalds
     

15 Oct, 2020

1 commit

  • data_realloc() returns the wrong data pointer when the block is wrapped and
    the size is not increased. This can happen when pr_cont() wants to
    add only a few characters and there is already space for them because
    of alignment.

    It might cause writing outside the buffer. It has been detected by LTP
    tests with KASAN enabled:

    [ 221.921944] oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=c,mems_allowed=0,oom_memcg=/0,task_memcg=in
    [ 221.922108] ==================================================================
    [ 221.922111] BUG: KASAN: global-out-of-bounds in vprintk_store+0x362/0x3d0
    [ 221.922112] Write of size 2 at addr ffffffffba51dbcd by task memcg_test_1/11282
    [ 221.922113]
    [ 221.922114] CPU: 1 PID: 11282 Comm: memcg_test_1 Not tainted 5.9.0-next-20201013 #1
    [ 221.922116] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS 2.0b 07/27/2017
    [ 221.922116] Call Trace:
    [ 221.922117] dump_stack+0xa4/0xd9
    [ 221.922118] print_address_description.constprop.0+0x21/0x210
    [ 221.922119] ? _raw_write_lock_bh+0xe0/0xe0
    [ 221.922120] ? vprintk_store+0x362/0x3d0
    [ 221.922121] kasan_report.cold+0x37/0x7c
    [ 221.922122] ? vprintk_store+0x362/0x3d0
    [ 221.922123] check_memory_region+0x18c/0x1f0
    [ 221.922124] memcpy+0x3c/0x60
    [ 221.922125] vprintk_store+0x362/0x3d0
    [ 221.922125] ? __ia32_sys_syslog+0x50/0x50
    [ 221.922126] ? _raw_spin_lock_irqsave+0x9b/0x100
    [ 221.922127] ? _raw_spin_lock_irq+0xf0/0xf0
    [ 221.922128] ? __kasan_check_write+0x14/0x20
    [ 221.922129] vprintk_emit+0x8d/0x1f0
    [ 221.922130] vprintk_default+0x1d/0x20
    [ 221.922131] vprintk_func+0x5a/0x100
    [ 221.922132] printk+0xb2/0xe3
    [ 221.922133] ? swsusp_write.cold+0x189/0x189
    [ 221.922134] ? kernfs_vfs_xattr_set+0x60/0x60
    [ 221.922134] ? _raw_write_lock_bh+0xe0/0xe0
    [ 221.922135] ? trace_hardirqs_on+0x38/0x100
    [ 221.922136] pr_cont_kernfs_path.cold+0x49/0x4b
    [ 221.922137] mem_cgroup_print_oom_context.cold+0x74/0xc3
    [ 221.922138] dump_header+0x340/0x3bf
    [ 221.922139] oom_kill_process.cold+0xb/0x10
    [ 221.922140] out_of_memory+0x1e9/0x860
    [ 221.922141] ? oom_killer_disable+0x210/0x210
    [ 221.922142] mem_cgroup_out_of_memory+0x198/0x1c0
    [ 221.922143] ? mem_cgroup_count_precharge_pte_range+0x250/0x250
    [ 221.922144] try_charge+0xa9b/0xc50
    [ 221.922145] ? arch_stack_walk+0x9e/0xf0
    [ 221.922146] ? memory_high_write+0x230/0x230
    [ 221.922146] ? avc_has_extended_perms+0x830/0x830
    [ 221.922147] ? stack_trace_save+0x94/0xc0
    [ 221.922148] ? stack_trace_consume_entry+0x90/0x90
    [ 221.922149] __memcg_kmem_charge+0x73/0x120
    [ 221.922150] ? cred_has_capability+0x10f/0x200
    [ 221.922151] ? mem_cgroup_can_attach+0x260/0x260
    [ 221.922152] ? selinux_sb_eat_lsm_opts+0x2f0/0x2f0
    [ 221.922153] ? obj_cgroup_charge+0x16b/0x220
    [ 221.922154] ? kmem_cache_alloc+0x78/0x4c0
    [ 221.922155] obj_cgroup_charge+0x122/0x220
    [ 221.922156] ? vm_area_alloc+0x20/0x90
    [ 221.922156] kmem_cache_alloc+0x78/0x4c0
    [ 221.922157] vm_area_alloc+0x20/0x90
    [ 221.922158] mmap_region+0x3ed/0x9a0
    [ 221.922159] ? cap_mmap_addr+0x1d/0x80
    [ 221.922160] do_mmap+0x3ee/0x720
    [ 221.922161] vm_mmap_pgoff+0x16a/0x1c0
    [ 221.922162] ? randomize_stack_top+0x90/0x90
    [ 221.922163] ? copy_page_range+0x1980/0x1980
    [ 221.922163] ksys_mmap_pgoff+0xab/0x350
    [ 221.922164] ? find_mergeable_anon_vma+0x110/0x110
    [ 221.922165] ? __audit_syscall_entry+0x1a6/0x1e0
    [ 221.922166] __x64_sys_mmap+0x8d/0xb0
    [ 221.922167] do_syscall_64+0x38/0x50
    [ 221.922168] entry_SYSCALL_64_after_hwframe+0x44/0xa9
    [ 221.922169] RIP: 0033:0x7fe8f5e75103
    [ 221.922172] Code: 54 41 89 d4 55 48 89 fd 53 4c 89 cb 48 85 ff 74 56 49 89 d9 45 89 f8 45 89 f2 44 89 e2 4c 89 ee 48 89 ef b8 09 00 00 00 0f 05 3d 00 f0 ff ff 77 7d 5b 5d 41 5c 41 5d 41 5e 41 5f c3 66 2e 0f
    [ 221.922173] RSP: 002b:00007ffd38c90198 EFLAGS: 00000246 ORIG_RAX: 0000000000000009
    [ 221.922175] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fe8f5e75103
    [ 221.922176] RDX: 0000000000000003 RSI: 0000000000001000 RDI: 0000000000000000
    [ 221.922178] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
    [ 221.922179] R10: 0000000000002022 R11: 0000000000000246 R12: 0000000000000003
    [ 221.922180] R13: 0000000000001000 R14: 0000000000002022 R15: 0000000000000000
    [ 221.922181]
    [ 221.922182] The buggy address belongs to the variable:
    [ 221.922183] clear_seq+0x2d/0x40
    [ 221.922183]
    [ 221.922184] Memory state around the buggy address:
    [ 221.922185] ffffffffba51da80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [ 221.922187] ffffffffba51db00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    [ 221.922188] >ffffffffba51db80: f9 f9 f9 f9 00 f9 f9 f9 f9 f9 f9 f9 00 f9 f9 f9
    [ 221.922189] ^
    [ 221.922190] ffffffffba51dc00: f9 f9 f9 f9 00 f9 f9 f9 f9 f9 f9 f9 00 f9 f9 f9
    [ 221.922191] ffffffffba51dc80: f9 f9 f9 f9 01 f9 f9 f9 f9 f9 f9 f9 00 f9 f9 f9
    [ 221.922193] ==================================================================
    [ 221.922194] Disabling lock debugging due to kernel taint
    [ 221.922196] ,task=memcg_test_1,pid=11280,uid=0
    [ 221.922205] Memory cgroup out of memory: Killed process 11280

    Link: https://lore.kernel.org/r/CA+G9fYt46oC7-BKryNDaaXPJ9GztvS2cs_7GjYRjanRi4+ryCQ@mail.gmail.com
    Fixes: 4cfc7258f876a7feba673ac ("printk: ringbuffer: add finalization/extension support")
    Reported-by: Naresh Kamboju
    Reviewed-by: John Ogness
    Acked-by: Sergey Senozhatsky
    Signed-off-by: Petr Mladek
    Link: https://lore.kernel.org/r/20201014175051.GC13775@alley

    Petr Mladek
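
The pointer mix-up described above can be illustrated with a simplified model of logical positions. This is a sketch under stated assumptions: the 64-byte ring, the macro names, and blk_storage_offset() are made up for the example, and the block header that the real ringbuffer stores in front of the data is ignored.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed size for illustration: a 64-byte ring. */
#define RING_BITS		6
#define RING_SIZE		(1UL << RING_BITS)
#define RING_INDEX(lpos)	((lpos) & (RING_SIZE - 1))
#define RING_WRAPS(lpos)	((lpos) >> RING_BITS)

/* Logical positions only grow; their low bits index into the ring. */
struct blk_lpos {
	unsigned long begin;
	unsigned long next;
};

/*
 * Return the ring offset where the block's storage really starts.
 * A block whose begin and next fall into different wraps did not fit
 * at the end of the ring, so its contents live at offset 0.
 */
static size_t blk_storage_offset(const struct blk_lpos *pos)
{
	assert(pos != NULL);
	if (RING_WRAPS(pos->begin) != RING_WRAPS(pos->next))
		return 0;			/* wrapped block */
	return RING_INDEX(pos->begin);		/* regular block */
}
```

Deriving the pointer from RING_INDEX(begin) alone would, for a wrapped block, point at the unused tail of the ring; writing there can run past the end of the buffer.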
     

12 Oct, 2020

1 commit


05 Oct, 2020

1 commit

  • Replace /* FALL THRU */ comment with the new pseudo-keyword macro
    fallthrough[1].

    [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through

    Signed-off-by: Gustavo A. R. Silva
    Signed-off-by: Petr Mladek
    Link: https://lore.kernel.org/r/20201002224627.GA30475@embeddedor

    Gustavo A. R. Silva
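
A user-space sketch of the pseudo-keyword follows. The macro mirrors the idea of the kernel's definition but is written here as a stand-in; cleanup_steps() is a hypothetical function used only to show the annotation.

```c
#include <assert.h>

/*
 * User-space stand-in for the pseudo-keyword. Recent GCC and Clang
 * provide the statement attribute; elsewhere fall back to a no-op.
 */
#if defined(__has_attribute)
# if __has_attribute(__fallthrough__)
#  define fallthrough __attribute__((__fallthrough__))
# endif
#endif
#ifndef fallthrough
# define fallthrough do {} while (0)	/* fallthrough */
#endif

/* Hypothetical example: count how many cleanup steps run for a stage. */
static int cleanup_steps(int stage)
{
	int steps = 0;

	switch (stage) {
	case 2:
		steps++;
		fallthrough;
	case 1:
		steps++;
		fallthrough;
	case 0:
		steps++;
		break;
	default:
		break;
	}
	return steps;
}
```

With -Wimplicit-fallthrough enabled, the compiler accepts the annotated cases and warns about any case that falls through without the annotation.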
     

30 Sep, 2020

2 commits

  • @setup_text_buf only copies the original text messages (without any
    prefix or extended text). It only needs to be LOG_LINE_MAX in size.

    Signed-off-by: John Ogness
    Reviewed-by: Petr Mladek
    Signed-off-by: Petr Mladek
    Link: https://lore.kernel.org/r/20200930090134.8723-3-john.ogness@linutronix.de

    John Ogness
     
  • If a reader provides a buffer that is smaller than the message text,
    the @text_len field of @info will have a value larger than the buffer
    size. If readers blindly read @text_len bytes of data without
    checking the size, they will read beyond their buffer.

    Add this check to record_print_text() to properly recognize when such
    truncation has occurred.

    Add a maximum size argument to the ringbuffer function to extend
    records so that records can not be created that are larger than the
    buffer size of readers.

    When extending records (LOG_CONT), do not extend records beyond
    LOG_LINE_MAX since that is the maximum size available in the buffers
    used by consoles and syslog.

    Fixes: f5f022e53b87 ("printk: reimplement log_cont using record extension")
    Reported-by: Marek Szyprowski
    Signed-off-by: John Ogness
    Reviewed-by: Petr Mladek
    Signed-off-by: Petr Mladek
    Link: https://lore.kernel.org/r/20200930090134.8723-2-john.ogness@linutronix.de

    John Ogness
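
The reader-side check described above can be sketched as follows. The struct rec_view type and the read_text() helper are hypothetical; the sketch only shows clamping the copy to the reader's buffer and reporting the truncation, not the actual record_print_text() logic.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical reader-side view of a record: text plus its full length. */
struct rec_view {
	const char *text;
	size_t text_len;	/* may exceed the reader's buffer size */
};

/*
 * Copy at most @buf_size bytes and return how many were copied.
 * *truncated is set when the record did not fit into the buffer.
 */
static size_t read_text(const struct rec_view *r, char *buf, size_t buf_size,
			int *truncated)
{
	size_t n = r->text_len;

	*truncated = 0;
	if (n > buf_size) {
		n = buf_size;
		*truncated = 1;
	}
	memcpy(buf, r->text, n);
	return n;
}
```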
     

22 Sep, 2020

3 commits

  • Since there is no code that will ever store anything into the dict
    ring, remove it. If any future dictionary properties are to be
    added, these should be added to the struct printk_info.

    Signed-off-by: John Ogness
    Reviewed-by: Petr Mladek
    Signed-off-by: Petr Mladek
    Link: https://lore.kernel.org/r/20200918223421.21621-4-john.ogness@linutronix.de

    John Ogness
     
  • Dictionaries are only used for SUBSYSTEM and DEVICE properties. The
    current implementation stores the property names each time they are
    used. This requires more space than otherwise necessary. Also,
    because the dictionary entries are currently considered optional,
    it cannot be relied upon that they are always available, even if the
    writer wanted to store them. These issues will increase should new
    dictionary properties be introduced.

    Rather than storing the subsystem and device properties in the
    dict ring, introduce a struct dev_printk_info with separate fields
    to store only the property values. Embed this struct within the
    struct printk_info to provide guaranteed availability.

    Signed-off-by: John Ogness
    Reviewed-by: Petr Mladek
    Signed-off-by: Petr Mladek
    Link: https://lore.kernel.org/r/87mu1jl6ne.fsf@jogness.linutronix.de

    John Ogness
     
  • The majority of the size of a descriptor is taken up by meta data,
    which is often not of interest to the ringbuffer (for example,
    when performing state checks). Since descriptors are often
    temporarily stored on the stack, keeping their size minimal will
    help reduce stack pressure.

    Rather than embedding the printk_info into the descriptor, create
    a separate printk_info array. The index of a descriptor in the
    descriptor array corresponds to the printk_info with the same
    index in the printk_info array. The rules for validity of a
    printk_info match the existing rules for the data blocks: the
    descriptor must be in a consistent state.

    Signed-off-by: John Ogness
    Reviewed-by: Petr Mladek
    Signed-off-by: Petr Mladek
    Link: https://lore.kernel.org/r/20200918223421.21621-2-john.ogness@linutronix.de

    John Ogness
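
A minimal sketch of the parallel-array layout, assuming a ring of 8 descriptors. The field selection and the names struct ring and to_info() are illustrative assumptions; only the idea that descriptor n owns info n comes from the message above.

```c
#include <assert.h>
#include <stdint.h>

#define COUNT_BITS	3			/* assumed: 8 descriptors */
#define DESC_COUNT	(1UL << COUNT_BITS)
#define DESC_INDEX(id)	((id) & (DESC_COUNT - 1))

/* Small descriptor: only what state checks need. */
struct desc {
	unsigned long state_var;
};

/* Meta data lives in a parallel array, not inside the descriptor. */
struct info {
	uint64_t seq;
	uint16_t text_len;
};

struct ring {
	struct desc descs[DESC_COUNT];
	struct info infos[DESC_COUNT];
};

static struct desc *to_desc(struct ring *r, unsigned long id)
{
	return &r->descs[DESC_INDEX(id)];
}

/* Same index as to_desc(): descriptor n owns info n. */
static struct info *to_info(struct ring *r, unsigned long id)
{
	return &r->infos[DESC_INDEX(id)];
}
```

A stack copy of struct desc now costs one word instead of the full meta data.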
     

15 Sep, 2020

8 commits

  • Use the record extending feature of the ringbuffer to implement
    continuous messages. This preserves the existing continuous message
    behavior.

    Signed-off-by: John Ogness
    Reviewed-by: Petr Mladek
    Signed-off-by: Petr Mladek
    Link: https://lore.kernel.org/r/20200914123354.832-7-john.ogness@linutronix.de

    John Ogness
     
  • Add support for extending the newest data block. For this, introduce
    a new finalization state (desc_finalized) denoting a committed
    descriptor that cannot be extended.

    Until a record is finalized, a writer can reopen that record to
    append new data. Reopening a record means transitioning from the
    desc_committed state back to the desc_reserved state.

    A writer can explicitly finalize a record if there is no intention
    of extending it. Also, records are automatically finalized when a
    new record is reserved. This relieves writers of needing to
    explicitly finalize while also making such records available to
    readers sooner. (Readers can only traverse finalized records.)

    Four new memory barrier pairs are introduced. Two of them are
    insignificant additions (data_realloc:A/desc_read:D and
    data_realloc:A/data_push_tail:B) because they are alternate path
    memory barriers that exactly match the purpose, pairing, and
    context of the two existing memory barrier pairs they provide an
    alternate path for. The other two new memory barrier pairs are
    significant additions:

    desc_reopen_last:A / _prb_commit:B - When reopening a descriptor,
    ensure the state transitions back to desc_reserved before
    fully trusting the descriptor data.

    _prb_commit:B / desc_reserve:D - When committing a descriptor,
    ensure the state transitions to desc_committed before checking
    the head ID to see if the descriptor needs to be finalized.

    Signed-off-by: John Ogness
    Reviewed-by: Petr Mladek
    Signed-off-by: Petr Mladek
    Link: https://lore.kernel.org/r/20200914123354.832-6-john.ogness@linutronix.de

    John Ogness
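
The state transitions described above can be sketched with C11 atomics. This is a deliberately reduced model: it keeps only the three states named in the message, uses hypothetical helper names, and leaves out the descriptor IDs and the memory barrier pairs that the real implementation depends on.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

enum desc_state {
	desc_reserved,
	desc_committed,
	desc_finalized,
};

struct desc {
	_Atomic int state;
};

/* Writer is done for now; the record may still be extended later. */
static bool desc_commit(struct desc *d)
{
	int expect = desc_reserved;

	return atomic_compare_exchange_strong(&d->state, &expect, desc_committed);
}

/* Reopen for appending: only possible while committed, not finalized. */
static bool desc_reopen(struct desc *d)
{
	int expect = desc_committed;

	return atomic_compare_exchange_strong(&d->state, &expect, desc_reserved);
}

/* No more extensions; readers may now traverse the record. */
static bool desc_finalize(struct desc *d)
{
	int expect = desc_committed;

	return atomic_compare_exchange_strong(&d->state, &expect, desc_finalized);
}
```

Each transition is a compare-and-swap, so a reopen attempt that races with a finalization simply fails instead of corrupting the state.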
     
  • Rather than deriving the state by evaluating bits within the flags
    area of the state variable, assign the states explicit values and
    set those values in the flags area. Introduce macros to make it
    simple to read and write state values for the state variable.

    Although the functionality is preserved, the binary representation
    for the states is changed.

    Signed-off-by: John Ogness
    Reviewed-by: Petr Mladek
    Signed-off-by: Petr Mladek
    Link: https://lore.kernel.org/r/20200914123354.832-5-john.ogness@linutronix.de

    John Ogness
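
One way to picture explicit state values in the flags area is shown below. The layout (two top bits for the state, remaining bits for the ID), the macro names, and the numeric state values are assumptions chosen for the example.

```c
#include <assert.h>
#include <limits.h>

/* Assumed layout: the two top bits hold the state, the rest the ID. */
#define SV_BITS		(sizeof(unsigned long) * CHAR_BIT)
#define FLAGS_SHIFT	(SV_BITS - 2)
#define FLAGS_MASK	(3UL << FLAGS_SHIFT)
#define ID_MASK		(~FLAGS_MASK)

/* Explicit state values (numbers assumed for illustration). */
enum sv_state {
	sv_reserved	= 0x0,
	sv_committed	= 0x1,
	sv_reusable	= 0x3,
};

/* Read the state or the ID out of a state variable. */
#define SV_STATE(sv)	(3UL & ((sv) >> FLAGS_SHIFT))
#define SV_ID(sv)	((sv) & ID_MASK)

/* Build a state variable from an ID and a state. */
#define SV_MAKE(id, state) \
	(((unsigned long)(state) << FLAGS_SHIFT) | ((id) & ID_MASK))
```

Because the state is a value rather than a combination of independent bits, adding a further state later only needs a new enum entry.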
     
  • prb_reserve() will set some meta data values and leave others
    uninitialized (or rather, containing the values of the previous
    wrap). Simplify the API by always clearing out all the fields.
    Only the sequence number is filled in. The caller is now
    responsible for filling in the rest of the meta data fields,
    in particular the text and dict lengths.

    Signed-off-by: John Ogness
    Reviewed-by: Petr Mladek
    Signed-off-by: Petr Mladek
    Link: https://lore.kernel.org/r/20200914123354.832-4-john.ogness@linutronix.de

    John Ogness
     
  • Rather than continually needing to explicitly check @begin and @next
    to identify a dataless block, introduce and use a BLK_DATALESS()
    macro.

    Signed-off-by: John Ogness
    Reviewed-by: Petr Mladek
    Signed-off-by: Petr Mladek
    Link: https://lore.kernel.org/r/20200914123354.832-3-john.ogness@linutronix.de

    John Ogness
     
  • Move the internal get_data() function as-is above prb_reserve() so
    that a later change can make use of the static function.

    Signed-off-by: John Ogness
    Reviewed-by: Petr Mladek
    Signed-off-by: Petr Mladek
    Link: https://lore.kernel.org/r/20200914123354.832-2-john.ogness@linutronix.de

    John Ogness
     
  • @state_var is copied as part of the descriptor copying via
    memcpy(). This is not allowed because @state_var is an atomic type,
    which in some implementations may contain a spinlock.

    Avoid using memcpy() with @state_var by explicitly copying the other
    fields of the descriptor. @state_var is set using atomic set
    operator before returning.

    Fixes: b6cf8b3f3312 ("printk: add lockless ringbuffer")
    Signed-off-by: John Ogness
    Reviewed-by: Petr Mladek
    Signed-off-by: Petr Mladek
    Link: https://lore.kernel.org/r/20200914094803.27365-2-john.ogness@linutronix.de

    John Ogness
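
The field-by-field copy can be sketched with C11 atomics standing in for the kernel's atomic type. The struct layout and the desc_copy() name are assumptions for the example.

```c
#include <assert.h>
#include <stdatomic.h>

struct blk_lpos {
	unsigned long begin;
	unsigned long next;
};

struct desc {
	atomic_ulong state_var;
	struct blk_lpos text_blk_lpos;
};

/*
 * Copy a descriptor without memcpy() over the atomic member: plain
 * fields are assigned directly, the state goes through atomic ops.
 */
static void desc_copy(struct desc *dst, struct desc *src)
{
	unsigned long sv = atomic_load(&src->state_var);

	dst->text_blk_lpos = src->text_blk_lpos;
	atomic_store(&dst->state_var, sv);
}
```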
     
  • It is expected that desc_read() will always set at least the
    @state_var field. However, if the descriptor is in an inconsistent
    state, no fields are set.

    Also, the second load of @state_var is not stored in @desc_out and
    so might not match the state value that is returned.

    Always set the last loaded @state_var into @desc_out, regardless of
    the descriptor consistency.

    Fixes: b6cf8b3f3312 ("printk: add lockless ringbuffer")
    Signed-off-by: John Ogness
    Reviewed-by: Petr Mladek
    Signed-off-by: Petr Mladek
    Link: https://lore.kernel.org/r/20200914094803.27365-1-john.ogness@linutronix.de

    John Ogness
     

08 Sep, 2020

1 commit

  • With commit 896fbe20b4e2333fb55 ("printk: use the lockless ringbuffer"),
    printk() started silently dropping messages without text because such
    records are not supported by the new printk ringbuffer.

    Add support for such records.

    Currently dataless records are denoted by INVALID_LPOS in order
    to recognize failed prb_reserve() calls. Change the ringbuffer
    to instead use two different identifiers (FAILED_LPOS and
    NO_LPOS) to distinguish between failed prb_reserve() records and
    successful dataless records, respectively.

    Fixes: 896fbe20b4e2333fb55 ("printk: use the lockless ringbuffer")
    Fixes: https://lkml.kernel.org/r/20200718121053.GA691245@elver.google.com
    Reported-by: Marco Elver
    Signed-off-by: John Ogness
    Cc: Petr Mladek
    Cc: Steven Rostedt
    Cc: Marco Elver
    Signed-off-by: Sergey Senozhatsky
    Signed-off-by: Petr Mladek
    Link: https://lore.kernel.org/r/20200721132528.9661-1-john.ogness@linutronix.de

    John Ogness
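
The two identifiers can be told apart as in the sketch below. The numeric values and helper names are assumptions; the example relies on the further assumption that real logical positions are aligned, which leaves the lowest bit free as a "no data" marker.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed values: bit 0 set means "this position carries no data". */
#define FAILED_LPOS	0x1UL	/* reservation failed */
#define NO_LPOS		0x3UL	/* valid record that carries no text */

#define LPOS_DATALESS(lpos)	(((lpos) & 1UL) != 0)

struct blk_lpos {
	unsigned long begin;
	unsigned long next;
};

/* True for both failed reservations and intentionally empty records. */
static bool blk_dataless(const struct blk_lpos *b)
{
	return LPOS_DATALESS(b->begin) && LPOS_DATALESS(b->next);
}

/* True only when the reservation itself failed. */
static bool blk_failed(const struct blk_lpos *b)
{
	return b->begin == FAILED_LPOS && b->next == FAILED_LPOS;
}
```

A reader can then return an empty string for a dataless block that is not failed, and report an error otherwise.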
     

01 Sep, 2020

1 commit


10 Aug, 2020

1 commit


05 Aug, 2020

1 commit

  • Pull printk updates from Petr Mladek:

    - Herbert Xu made printk header file self-contained.

    - Andy Shevchenko and Sergey Senozhatsky cleaned up console->setup()
    error handling.

    - Andy Shevchenko did some cleanups (e.g. sparse warning) in vsprintf
    code.

    - Minor documentation updates.

    * tag 'printk-for-5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
    lib/vsprintf: Force type of flags value for gfp_t
    lib/vsprintf: Replace custom spec to print decimals with generic one
    lib/vsprintf: Replace hidden BUILD_BUG_ON() with static_assert()
    printk: Make linux/printk.h self-contained
    doc:kmsg: explicitly state the return value in case of SEEK_CUR
    Replace HTTP links with HTTPS ones: vsprintf
    hvc: unify console setup naming
    console: Fix trivia typo 'change' -> 'chance'
    console: Propagate error code from console ->setup()
    tty: hvc: Return proper error code from console ->setup() hook
    serial: sunzilog: Return proper error code from console ->setup() hook
    serial: sunsab: Return proper error code from console ->setup() hook
    mips: Return proper error code from console ->setup() hook

    Linus Torvalds
     

04 Aug, 2020

1 commit


13 Jul, 2020

1 commit

  • The commit 625d3449788f ("Revert "kernel/printk: add kmsg SEEK_CUR
    handling"") reverted a change done to the return value in case a SEEK_CUR
    operation was performed for kmsg buffer based on the fact that different
    userspace apps were handling the new return value (-ESPIPE) in different
    ways, breaking them.

    At the same time, -ESPIPE was the wrong decision because kmsg /does support/
    seek() but doesn't follow the "normal" behavior userspace is used to.
    Because of that, and considering how long -EINVAL has been in use, it was
    decided to keep it this way to avoid more userspace breakage.

    This patch adds an official statement to the kmsg documentation pointing to
    the current return value for SEEK_CUR, -EINVAL, thus userspace libraries
    and apps can refer to it for a definitive guide on what to expect.

    Signed-off-by: Bruno Meneguele
    Reviewed-by: Sergey Senozhatsky
    Signed-off-by: Petr Mladek
    Link: https://lore.kernel.org/r/20200710174423.10480-1-bmeneg@redhat.com

    Bruno Meneguele
     

10 Jul, 2020

3 commits

  • Replace the existing ringbuffer usage and implementation with
    lockless ringbuffer usage. Even though the new ringbuffer does not
    require locking, all existing locking is left in place. Therefore,
    this change is purely replacing the underlying ringbuffer.

    Changes that exist due to the ringbuffer replacement:

    - The VMCOREINFO has been updated for the new structures.

    - Dictionary data is now stored in a separate data buffer from the
    human-readable messages. The dictionary data buffer is set to the
    same size as the message buffer. Therefore, the total required
    memory for both dictionary and message data is
    2 * (2 ^ CONFIG_LOG_BUF_SHIFT) for the initial static buffers and
    2 * log_buf_len (the kernel parameter) for the dynamic buffers.

    - Record meta-data is now stored in a separate array of descriptors.
    This is an additional 72 * (2 ^ (CONFIG_LOG_BUF_SHIFT - 5)) bytes
    for the static array and 72 * (log_buf_len >> 5) bytes for the
    dynamic array.

    Signed-off-by: John Ogness
    Reviewed-by: Petr Mladek
    Signed-off-by: Petr Mladek
    Link: https://lore.kernel.org/r/20200709132344.760-5-john.ogness@linutronix.de

    John Ogness
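
The size formulas above can be checked with a few lines of C. The helper names are made up for the example; the constants come straight from the message. For an assumed CONFIG_LOG_BUF_SHIFT of 17, the two data buffers take 262144 bytes and the descriptor array takes 294912 bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Message plus dictionary buffers: 2 * (2 ^ shift) bytes. */
static size_t data_bytes(unsigned int shift)
{
	return 2 * ((size_t)1 << shift);
}

/* Descriptor array: 72 * (2 ^ (shift - 5)) bytes. */
static size_t desc_bytes(unsigned int shift)
{
	assert(shift >= 5);
	return 72 * ((size_t)1 << (shift - 5));
}
```

Since 72 / 32 = 2.25, the descriptor array is 2.25 times the size of a single message buffer, whatever the shift.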
     
  • This reverts commit 3ac37a93fa9217e576bebfd4ba3e80edaaeb2289.

    This optimization will not apply once the transition to a lockless
    printk is complete. Rather than porting this optimization through
    the transition only to remove it anyway, just revert it now to
    simplify the transition.

    Signed-off-by: John Ogness
    Reviewed-by: Petr Mladek
    Acked-by: Sergey Senozhatsky
    Signed-off-by: Petr Mladek
    Link: https://lore.kernel.org/r/20200709132344.760-4-john.ogness@linutronix.de

    John Ogness
     
  • Introduce a multi-reader multi-writer lockless ringbuffer for storing
    the kernel log messages. Readers and writers may use their API from
    any context (including scheduler and NMI). This ringbuffer will make
    it possible to decouple printk() callers from any context, locking,
    or console constraints. It also makes it possible for readers to have
    full access to the ringbuffer contents at any time and context (for
    example from any panic situation).

    The printk_ringbuffer is made up of 3 internal ringbuffers:

    desc_ring:
    A ring of descriptors. A descriptor contains all record meta data
    (sequence number, timestamp, loglevel, etc.) as well as internal state
    information about the record and logical positions specifying where in
    the other ringbuffers the text and dictionary strings are located.

    text_data_ring:
    A ring of data blocks. A data block consists of an unsigned long
    integer (ID) that maps to a desc_ring index followed by the text
    string of the record.

    dict_data_ring:
    A ring of data blocks. A data block consists of an unsigned long
    integer (ID) that maps to a desc_ring index followed by the dictionary
    string of the record.

    The internal state information of a descriptor is the key element to
    allow readers and writers to locklessly synchronize access to the data.

    Co-developed-by: Petr Mladek
    Signed-off-by: John Ogness
    Reviewed-by: Petr Mladek
    Reviewed-by: Paul E. McKenney
    Signed-off-by: Petr Mladek
    Link: https://lore.kernel.org/r/20200709132344.760-3-john.ogness@linutronix.de

    John Ogness
     

25 Jun, 2020

2 commits

  • I bet the word 'chance' has to be used in 'had a chance to be called',
    but, alas, I'm not native speaker...

    Signed-off-by: Andy Shevchenko
    Reviewed-by: Petr Mladek
    Acked-by: Benjamin Herrenschmidt
    Signed-off-by: Petr Mladek
    Link: https://lore.kernel.org/r/20200618164751.56828-7-andriy.shevchenko@linux.intel.com

    Andy Shevchenko
     
  • Since console ->setup() hook returns meaningful error codes,
    propagate it to the caller of try_enable_new_console().

    Signed-off-by: Andy Shevchenko
    Reviewed-by: Petr Mladek
    Acked-by: Benjamin Herrenschmidt
    Signed-off-by: Petr Mladek
    Link: https://lore.kernel.org/r/20200618164751.56828-6-andriy.shevchenko@linux.intel.com

    Andy Shevchenko
     

22 Jun, 2020

1 commit

  • This reverts commit 8ece3b3eb576a78d2e67ad4c3a80a39fa6708809.

    This commit broke userspace. Bash uses ESPIPE to determine whether or
    not the file should be read using "unbuffered I/O", which means reading
    1 byte at a time instead of 128 bytes at a time. I used to use bash to
    read through kmsg in a really quite nasty way:

    while read -t 0.1 -r line 2>/dev/null || [[ $? -ne 142 ]]; do
    echo "SARU $line"
    done < /dev/kmsg

    This will show all lines that can fit into the 128 byte buffer, and skip
    lines that don't. That's pretty awful, but at least it worked.

    With this change, bash now tries to do 1-byte reads, which means it
    skips all the lines, which is worse than before.

    Now, I don't really care very much about this, and I'm already looking for
    a workaround. But I did just spend an hour trying to figure out why my
    scripts were broken. Either way, it makes no difference to me personally
    whether this is reverted, but it might be something to consider. If you
    declare that "trying to read /dev/kmsg with bash is terminally stupid
    anyway," I might be inclined to agree with you. But do note that bash
    uses lseek(fd, 0, SEEK_CUR)==>ESPIPE to determine whether or not it's
    reading from a pipe.

    Cc: Bruno Meneguele
    Cc: Sergey Senozhatsky
    Cc: Steven Rostedt
    Cc: David Laight
    Cc: Sergey Senozhatsky
    Cc: Petr Mladek
    Signed-off-by: Jason A. Donenfeld
    Signed-off-by: Linus Torvalds

    Jason A. Donenfeld
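
The probe that the message attributes to bash can be reproduced in a few lines of C. The helper name fd_reports_espipe() is made up for the example; lseek() and ESPIPE are standard POSIX.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/* True when lseek() reports ESPIPE for @fd (pipe, FIFO, or socket). */
static bool fd_reports_espipe(int fd)
{
	errno = 0;
	return lseek(fd, 0, SEEK_CUR) == (off_t)-1 && errno == ESPIPE;
}
```

A program using this probe to pick its read strategy changes behavior as soon as a file starts or stops returning ESPIPE, which is the kind of breakage the revert addresses.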
     

13 Jun, 2020

1 commit

  • Pull printk fix from Petr Mladek:
    "One more printk change for 5.8: make sure that messages printed from
    KDB context are redirected to KDB console handlers. It did not work
    when KDB interrupted NMI or printk_safe contexts.

    Arm people started hitting this problem more often recently. I forgot
    to add the fix into the previous pull request by mistake"

    * tag 'printk-for-5.8-kdb-nmi' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
    printk/kdb: Redirect printk messages into kdb in any context

    Linus Torvalds
     

11 Jun, 2020

1 commit

  • kdb has to get messages on consoles even when the system is stopped.
    It uses kdb_printf() internally and calls console drivers on its own.

    It uses a hack to reuse existing code. It sets the "kdb_trap_printk"
    global variable to redirect even the normal printk() into the
    kdb_printf() variant.

    The variable "kdb_trap_printk" is checked in vprintk_default() and
    it is ignored when printk is redirected to printk_safe in NMI context.
    Solve this by moving the check into vprintk_func().

    It is obvious that it is not fully safe. But it does not make things
    worse. The console drivers are already called in this context by
    kdb_printf() direct calls.

    Reported-by: Sumit Garg
    Tested-by: Sumit Garg
    Reviewed-by: Daniel Thompson
    Acked-by: Sergey Senozhatsky
    Signed-off-by: Petr Mladek
    Link: https://lore.kernel.org/r/20200520102233.GC3464@linux-b0ei

    Petr Mladek
     

04 Jun, 2020

1 commit

  • Pull networking updates from David Miller:

    1) Allow setting bluetooth L2CAP modes via socket option, from Luiz
    Augusto von Dentz.

    2) Add GSO partial support to igc, from Sasha Neftin.

    3) Several cleanups and improvements to r8169 from Heiner Kallweit.

    4) Add IF_OPER_TESTING link state and use it when ethtool triggers a
    device self-test. From Andrew Lunn.

    5) Start moving away from custom driver versions, use the globally
    defined kernel version instead, from Leon Romanovsky.

    6) Support GRO via gro_cells in DSA layer, from Alexander Lobakin.

    7) Allow hard IRQ deferral during NAPI, from Eric Dumazet.

    8) Add sriov and vf support to hinic, from Luo bin.

    9) Support Media Redundancy Protocol (MRP) in the bridging code, from
    Horatiu Vultur.

    10) Support netmap in the nft_nat code, from Pablo Neira Ayuso.

    11) Allow UDPv6 encapsulation of ESP in the ipsec code, from Sabrina
    Dubroca. Also add ipv6 support for espintcp.

    12) Lots of ReST conversions of the networking documentation, from Mauro
    Carvalho Chehab.

    13) Support configuration of ethtool rxnfc flows in bcmgenet driver,
    from Doug Berger.

    14) Allow to dump cgroup id and filter by it in inet_diag code, from
    Dmitry Yakunin.

    15) Add infrastructure to export netlink attribute policies to
    userspace, from Johannes Berg.

    16) Several optimizations to sch_fq scheduler, from Eric Dumazet.

    17) Fallback to the default qdisc if qdisc init fails because otherwise
    a packet scheduler init failure will make a device inoperative. From
    Jesper Dangaard Brouer.

    18) Several RISCV bpf jit optimizations, from Luke Nelson.

    19) Correct the return type of the ->ndo_start_xmit() method in several
    drivers, it's netdev_tx_t but many drivers were using
    'int'. From Yunjian Wang.

    20) Add an ethtool interface for PHY master/slave config, from Oleksij
    Rempel.

    21) Add BPF iterators, from Yonghang Song.

    22) Add cable test infrastructure, including ethtool interfaces, from
    Andrew Lunn. Marvell PHY driver is the first to support this
    facility.

    23) Remove zero-length arrays all over, from Gustavo A. R. Silva.

    24) Calculate and maintain an explicit frame size in XDP, from Jesper
    Dangaard Brouer.

    25) Add CAP_BPF, from Alexei Starovoitov.

    26) Support terse dumps in the packet scheduler, from Vlad Buslov.

    27) Support XDP_TX bulking in dpaa2 driver, from Ioana Ciornei.

    28) Add devm_register_netdev(), from Bartosz Golaszewski.

    29) Minimize qdisc resets, from Cong Wang.

    30) Get rid of kernel_getsockopt and kernel_setsockopt in order to
    eliminate set_fs/get_fs calls. From Christoph Hellwig.

    * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2517 commits)
    selftests: net: ip_defrag: ignore EPERM
    net_failover: fixed rollback in net_failover_open()
    Revert "tipc: Fix potential tipc_aead refcnt leak in tipc_crypto_rcv"
    Revert "tipc: Fix potential tipc_node refcnt leak in tipc_rcv"
    vmxnet3: allow rx flow hash ops only when rss is enabled
    hinic: add set_channels ethtool_ops support
    selftests/bpf: Add a default $(CXX) value
    tools/bpf: Don't use $(COMPILE.c)
    bpf, selftests: Use bpf_probe_read_kernel
    s390/bpf: Use bcr 0,%0 as tail call nop filler
    s390/bpf: Maintain 8-byte stack alignment
    selftests/bpf: Fix verifier test
    selftests/bpf: Fix sample_cnt shared between two threads
    bpf, selftests: Adapt cls_redirect to call csum_level helper
    bpf: Add csum_level helper for fixing up csum levels
    bpf: Fix up bpf_skb_adjust_room helper's skb csum setting
    sfc: add missing annotation for efx_ef10_try_update_nic_stats_vf()
    crypto/chtls: IPv6 support for inline TLS
    Crypto/chcr: Fixes a coccinile check error
    Crypto/chcr: Fixes compilations warnings
    ...

    Linus Torvalds
     

02 Jun, 2020

2 commits

  • Pull RCU updates from Ingo Molnar:
    "The RCU updates for this cycle were:

    - RCU-tasks update, including addition of RCU Tasks Trace for BPF use
    and TASKS_RUDE_RCU

    - kfree_rcu() updates.

    - Remove scheduler locking restriction

    - RCU CPU stall warning updates.

    - Torture-test updates.

    - Miscellaneous fixes and other updates"

    * tag 'core-rcu-2020-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (103 commits)
    rcu: Allow for smp_call_function() running callbacks from idle
    rcu: Provide rcu_irq_exit_check_preempt()
    rcu: Abstract out rcu_irq_enter_check_tick() from rcu_nmi_enter()
    rcu: Provide __rcu_is_watching()
    rcu: Provide rcu_irq_exit_preempt()
    rcu: Make RCU IRQ enter/exit functions rely on in_nmi()
    rcu/tree: Mark the idle relevant functions noinstr
    x86: Replace ist_enter() with nmi_enter()
    x86/mce: Send #MC singal from task work
    x86/entry: Get rid of ist_begin/end_non_atomic()
    sched,rcu,tracing: Avoid tracing before in_nmi() is correct
    sh/ftrace: Move arch_ftrace_nmi_{enter,exit} into nmi exception
    lockdep: Always inline lockdep_{off,on}()
    hardirq/nmi: Allow nested nmi_enter()
    arm64: Prepare arch_nmi_enter() for recursion
    printk: Disallow instrumenting print_nmi_enter()
    printk: Prepare for nested printk_nmi_enter()
    rcutorture: Convert ULONG_CMP_LT() to time_before()
    torture: Add a --kasan argument
    torture: Save a few lines by using config_override_param initially
    ...

    Linus Torvalds
     
  • Pull printk updates from Petr Mladek:

    - Benjamin Herrenschmidt solved a problem with non-matched console
    aliases by first checking consoles defined on the command line. It is
    a more conservative approach than the previous attempts.

    - Benjamin also made sure that the console accessible via /dev/console
    always has the CON_CONSDEV flag.

    - Andy Shevchenko added the %ptT modifier for printing time64_t values.
    It extends the existing %ptR handling for struct rtc_time.

    - Bruno Meneguele fixed the error value that /dev/kmsg returned for the
    unsupported SEEK_CUR.

    - Tetsuo Handa removed unused pr_cont_once().

    ... and a few small fixes.
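
For readers unfamiliar with the %ptT modifier mentioned above: by default it prints a time64_t as `YYYY-mm-ddTHH:MM:SS`. The sketch below is a userspace approximation of that output using standard libc calls, not the lib/vsprintf implementation; `format_time64()` is a hypothetical helper name, and the seconds value is assumed to be interpreted without any timezone offset.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/*
 * Userspace stand-in for the default %ptT output format.
 * Writes "YYYY-mm-ddTHH:MM:SS" into buf and returns the number of
 * characters written (excluding the terminating NUL), or 0 on failure.
 */
static size_t format_time64(int64_t secs, char *buf, size_t len)
{
	time_t t = (time_t)secs;
	struct tm tm;

	/* Break the seconds count into calendar fields with no timezone offset. */
	if (!gmtime_r(&t, &tm))
		return 0;

	return strftime(buf, len, "%Y-%m-%dT%H:%M:%S", &tm);
}
```

In kernel code the equivalent call would pass a pointer to the value, along the lines of `pr_info("%ptT\n", &t)`, since the %p family of modifiers takes pointer arguments.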

    * tag 'printk-for-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
    printk: Remove pr_cont_once()
    printk: handle blank console arguments passed in.
    kernel/printk: add kmsg SEEK_CUR handling
    printk: Fix a typo in comment "interator"->"iterator"
    usb: pulse8-cec: Switch to use %ptT
    ARM: bcm2835: Switch to use %ptT
    lib/vsprintf: Print time64_t in human readable format
    lib/vsprintf: update comment about simple_strto() functions
    printk: Correctly set CON_CONSDEV even when preferred console was not registered
    printk: Fix preferred console selection with multiple matches
    printk: Move console matching logic into a separate function
    printk: Convert a use of sprintf to snprintf in console_unlock

    Linus Torvalds