20 Jan, 2021

1 commit

  • [ Upstream commit 1b04fa9900263b4e217ca2509fd778b32c2b4eb2 ]

    PowerPC testing encountered boot failures due to RCU Tasks not being
    fully initialized until core_initcall() time. This commit therefore
    initializes RCU Tasks (along with Rude RCU and RCU Tasks Trace) just
    before early_initcall() time, thus allowing waiting on RCU Tasks grace
    periods from early_initcall() handlers.

    Link: https://lore.kernel.org/rcu/87eekfh80a.fsf@dja-thinkpad.axtens.net/
    Fixes: 36dadef23fcc ("kprobes: Init kprobes in early_initcall")
    Tested-by: Daniel Axtens
    Signed-off-by: Uladzislau Rezki (Sony)
    Signed-off-by: Paul E. McKenney
    Signed-off-by: Sasha Levin

    Uladzislau Rezki (Sony)
     

01 Dec, 2020

1 commit

  • Load the size and the checksum fields in the footer as le32
    instead of u32. This will allow us to apply bootconfig to a
    cross-built initrd without caring about the endianness.

    Link: https://lkml.kernel.org/r/160583934457.547349.10504070298990791074.stgit@devnote2

    Reported-by: Steven Rostedt
    Suggested-by: Linus Torvalds
    Signed-off-by: Masami Hiramatsu
    Signed-off-by: Steven Rostedt (VMware)

    Masami Hiramatsu
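
    As an illustration of the endianness point above, here is a minimal
    user-space sketch of decoding two le32 footer fields byte by byte, so
    the result does not depend on the host byte order. The names
    load_le32(), parse_footer() and struct footer_fields are hypothetical,
    and the "size first, checksum second" order is an assumption, not
    something this log states.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode a 32-bit little-endian value from four bytes. The result is the
 * same on little-endian and big-endian hosts. */
static uint32_t load_le32(const unsigned char *p)
{
    assert(p != NULL);
    return (uint32_t)p[0] |
           ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) |
           ((uint32_t)p[3] << 24);
}

/* Hypothetical view of the footer fields (layout is an assumption). */
struct footer_fields {
    uint32_t size;  /* length of the bootconfig data */
    uint32_t csum;  /* checksum of the bootconfig data */
};

/* Parse 8 footer bytes: size at offset 0, checksum at offset 4. */
static struct footer_fields parse_footer(const unsigned char *footer)
{
    struct footer_fields f;

    f.size = load_le32(footer);
    f.csum = load_le32(footer + 4);
    return f;
}
```

    Decoding byte by byte avoids both alignment and byte-order
    assumptions, which is why a cross-built image can be read on any host.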
     

13 Nov, 2020

1 commit

  • Since GRUB may align the size of the initrd to 4 bytes if the user
    passes the initrd from cpio, we have to check the preceding 3 bytes
    as well.

    Link: https://lkml.kernel.org/r/160520205132.303174.4876760192433315429.stgit@devnote2

    Cc: stable@vger.kernel.org
    Fixes: 85c46b78da58 ("bootconfig: Add bootconfig magic word for indicating bootconfig explicitly")
    Reported-by: Chen Yu
    Tested-by: Chen Yu
    Signed-off-by: Masami Hiramatsu
    Signed-off-by: Steven Rostedt (VMware)

    Masami Hiramatsu
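
    The check described above can be sketched in user space roughly as
    follows: look for the magic word at the very end of the buffer, then
    at up to 3 bytes before it, to tolerate padding added by 4-byte
    alignment. find_magic() is a hypothetical helper, and the magic string
    used here is an assumption about the on-disk format.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAGIC     "#BOOTCONFIG\n"  /* assumed magic word */
#define MAGIC_LEN 12

/* Return the offset of the magic word within buf, or -1 if it is not
 * found. Up to 3 trailing padding bytes are tolerated. */
static long find_magic(const unsigned char *buf, size_t len)
{
    size_t pad;

    assert(buf != NULL);
    for (pad = 0; pad < 4; pad++) {
        if (len < MAGIC_LEN + pad)
            break;  /* buffer too short for this candidate position */
        if (memcmp(buf + len - MAGIC_LEN - pad, MAGIC, MAGIC_LEN) == 0)
            return (long)(len - MAGIC_LEN - pad);
    }
    return -1;
}
```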
     

19 Oct, 2020

1 commit

  • …/kernel/git/shuah/linux-kselftest

    Pull more KUnit updates from Shuah Khan:

    - add KUnit to kernel_init() and remove KUnit from init calls entirely.

    This addresses the concern that KUnit would not work correctly during
    the late init phase.

    - add a linker section where KUnit can put references to its test
    suites.

    This is the first step in transitioning to dispatching all KUnit
    tests from a centralized executor rather than having each as its own
    separate late_initcall.

    - add a centralized executor to dispatch tests rather than relying on
    late_initcall to schedule each test suite separately. Centralized
    execution is for built-in tests only; modules will execute tests when
    loaded.

    - convert bitfield test to use KUnit framework

    - Documentation updates for naming guidelines and how
    kunit_test_suite() works.

    - add test plan to KUnit TAP format

    * tag 'linux-kselftest-kunit-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
    lib: kunit: Fix compilation test when using TEST_BIT_FIELD_COMPILE
    lib: kunit: add bitfield test conversion to KUnit
    Documentation: kunit: add a brief blurb about kunit_test_suite
    kunit: test: add test plan to KUnit TAP format
    init: main: add KUnit to kernel init
    kunit: test: create a single centralized executor for all tests
    vmlinux.lds.h: add linker section for KUnit test suites
    Documentation: kunit: Add naming guidelines

    Linus Torvalds
     

16 Oct, 2020

1 commit

  • Pull trivial updates from Jiri Kosina:
    "The latest advances in computer science from the trivial queue"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial:
    xtensa: fix Kconfig typo
    spelling.txt: Remove some duplicate entries
    mtd: rawnand: oxnas: cleanup/simplify code
    selftests: vm: add fragment CONFIG_GUP_BENCHMARK
    perf: Fix opt help text for --no-bpf-event
    HID: logitech-dj: Fix spelling in comment
    bootconfig: Fix kernel message mentioning CONFIG_BOOT_CONFIG
    MAINTAINERS: rectify MMP SUPPORT after moving cputype.h
    scif: Fix spelling of EACCES
    printk: fix global comment
    lib/bitmap.c: fix spello
    fs: Fix missing 'bit' in comment

    Linus Torvalds
     

10 Oct, 2020

1 commit

  • Although we have not seen any actual examples where KUnit doesn't work
    because it runs in the late init phase of the kernel, it has been a
    concern for some time that this could potentially be an issue in the
    future. So, remove KUnit from init calls entirely, instead call directly
    from kernel_init() so that KUnit runs after late init.

    Co-developed-by: Alan Maguire
    Signed-off-by: Alan Maguire
    Signed-off-by: Brendan Higgins
    Reviewed-by: Stephen Boyd
    Reviewed-by: Kees Cook
    Reviewed-by: Luis Chamberlain
    Signed-off-by: Shuah Khan

    Brendan Higgins
     

19 Sep, 2020

2 commits

  • This eliminates the following sparse warning:

    init/main.c:306:6: warning: symbol 'xbc_namebuf' was not declared.
    Should it be static?

    Link: https://lkml.kernel.org/r/20200915070324.2239473-1-yanaijie@huawei.com

    Reported-by: Hulk Robot
    Acked-by: Masami Hiramatsu
    Signed-off-by: Jason Yan
    Signed-off-by: Steven Rostedt (VMware)

    Jason Yan
     
  • Since the kprobe_event= cmdline option allows the user to put kprobes
    on functions in initmem, kprobes has to remove such probes after boot.
    Currently, probes on init functions in modules are handled by the
    module callback, but the kernel init text isn't handled.
    Without this, kprobes may access a non-existent text area to disable
    or remove a probe.

    Link: https://lkml.kernel.org/r/159972810544.428528.1839307531600646955.stgit@devnote2

    Fixes: 970988e19eb0 ("tracing/kprobe: Add kprobe_event= boot parameter")
    Cc: Jonathan Corbet
    Cc: Shuah Khan
    Cc: Randy Dunlap
    Cc: Ingo Molnar
    Cc: stable@vger.kernel.org
    Signed-off-by: Masami Hiramatsu
    Signed-off-by: Steven Rostedt (VMware)

    Masami Hiramatsu
     

01 Sep, 2020

1 commit


08 Aug, 2020

4 commits

  • Pull tracing updates from Steven Rostedt:

    - The biggest news is that the tracing ring buffer can now time events
    that interrupted other ring buffer events.

    Before this change, if an interrupt came in while recording another
    event, and that interrupt also had an event, those events would all
    have the same time stamp as the event it interrupted.

    Now, with the new design, those events will have a unique time stamp
    and rightfully display the time for those events that were recorded
    while interrupting another event.

    - Bootconfig now has an "override" operator that lets the users have a
    default config, but then add options to override the default.

    - A fix was made to properly filter function graph tracing to the
    ftrace PIDs. This came in at the end of the -rc cycle, and needs to
    be backported.

    - Several clean ups, performance updates, and minor fixes as well.

    * tag 'trace-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (39 commits)
    tracing: Add trace_array_init_printk() to initialize instance trace_printk() buffers
    kprobes: Fix compiler warning for !CONFIG_KPROBES_ON_FTRACE
    tracing: Use trace_sched_process_free() instead of exit() for pid tracing
    bootconfig: Fix to find the initargs correctly
    Documentation: bootconfig: Add bootconfig override operator
    tools/bootconfig: Add testcases for value override operator
    lib/bootconfig: Add override operator support
    kprobes: Remove show_registers() function prototype
    tracing/uprobe: Remove dead code in trace_uprobe_register()
    kprobes: Fix NULL pointer dereference at kprobe_ftrace_handler
    ftrace: Fix ftrace_trace_task return value
    tracepoint: Use __used attribute definitions from compiler_attributes.h
    tracepoint: Mark __tracepoint_string's __used
    trace : Have tracing buffer info use kvzalloc instead of kzalloc
    tracing: Remove outdated comment in stack handling
    ftrace: Do not let direct or IPMODIFY ftrace_ops be added to module and set trampolines
    ftrace: Setup correct FTRACE_FL_REGS flags for module
    tracing/hwlat: Honor the tracing_cpumask
    tracing/hwlat: Drop the duplicate assignment in start_kthread()
    tracing: Save one trace_event->type by using __TRACE_LAST_TYPE
    ...

    Linus Torvalds
     
  • Merge misc updates from Andrew Morton:

    - a few MM hotfixes

    - kthread, tools, scripts, ntfs and ocfs2

    - some of MM

    Subsystems affected by this patch series: kthread, tools, scripts, ntfs,
    ocfs2 and mm (hotfixes, pagealloc, slab-generic, slab, slub, kcsan,
    debug, pagecache, gup, swap, shmem, memcg, pagemap, mremap, mincore,
    sparsemem, vmalloc, kasan, pagealloc, hugetlb and vmscan).

    * emailed patches from Andrew Morton: (162 commits)
    mm: vmscan: consistent update to pgrefill
    mm/vmscan.c: fix typo
    khugepaged: khugepaged_test_exit() check mmget_still_valid()
    khugepaged: retract_page_tables() remember to test exit
    khugepaged: collapse_pte_mapped_thp() protect the pmd lock
    khugepaged: collapse_pte_mapped_thp() flush the right range
    mm/hugetlb: fix calculation of adjust_range_if_pmd_sharing_possible
    mm: thp: replace HTTP links with HTTPS ones
    mm/page_alloc: fix memalloc_nocma_{save/restore} APIs
    mm/page_alloc.c: skip setting nodemask when we are in interrupt
    mm/page_alloc: fallbacks at most has 3 elements
    mm/page_alloc: silence a KASAN false positive
    mm/page_alloc.c: remove unnecessary end_bitidx for [set|get]_pfnblock_flags_mask()
    mm/page_alloc.c: simplify pageblock bitmap access
    mm/page_alloc.c: extract the common part in pfn_to_bitidx()
    mm/page_alloc.c: replace the definition of NR_MIGRATETYPE_BITS with PB_migratetype_bits
    mm/shuffle: remove dynamic reconfiguration
    mm/memory_hotplug: document why shuffle_zone() is relevant
    mm/page_alloc: remove nr_free_pagecache_pages()
    mm: remove vm_total_pages
    ...

    Linus Torvalds
     
  • This patch prepares Software Tag-Based KASAN for stack tagging support.

    With stack tagging enabled, KASAN tags stack variable in each function in
    its prologue. In start_kernel() stack variables get tagged before KASAN
    is enabled via setup_arch()->kasan_init(). As the result the tags for
    start_kernel()'s stack variables end up in the temporary shadow memory.
    Later when KASAN gets enabled, switched to normal shadow, and starts
    checking tags, this leads to false-positive reports, as proper tags are
    missing in normal shadow.

    Disable KASAN instrumentation for start_kernel(). Also disable it for
    arm64's setup_arch() as a precaution (it doesn't have any stack variables
    right now).

    [andreyknvl@google.com: reorder attributes for start_kernel()]
    Link: http://lkml.kernel.org/r/26fb6165a17abcf61222eda5184c030fb6b133d1.1596544734.git.andreyknvl@google.com

    Signed-off-by: Andrey Konovalov
    Signed-off-by: Andrew Morton
    Acked-by: Catalin Marinas [arm64]
    Cc: Alexander Potapenko
    Cc: Andrey Ryabinin
    Cc: Dmitry Vyukov
    Cc: Elena Petrova
    Cc: Marco Elver
    Cc: Vincenzo Frascino
    Cc: Walter Wu
    Cc: Ard Biesheuvel
    Link: http://lkml.kernel.org/r/55d432671a92e931ab8234b03dc36b14d4c21bfb.1596199677.git.andreyknvl@google.com
    Signed-off-by: Linus Torvalds

    Andrey Konovalov
     
  • Pull init and set_fs() cleanups from Al Viro:
    "Christoph's 'getting rid of ksys_...() uses under KERNEL_DS' series"

    * 'hch.init_path' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (50 commits)
    init: add an init_dup helper
    init: add an init_utimes helper
    init: add an init_stat helper
    init: add an init_mknod helper
    init: add an init_mkdir helper
    init: add an init_symlink helper
    init: add an init_link helper
    init: add an init_eaccess helper
    init: add an init_chmod helper
    init: add an init_chown helper
    init: add an init_chroot helper
    init: add an init_chdir helper
    init: add an init_rmdir helper
    init: add an init_unlink helper
    init: add an init_umount helper
    init: add an init_mount helper
    init: mark create_dev as __init
    init: mark console_on_rootfs as __init
    init: initialize ramdisk_execute_command at compile time
    devtmpfs: refactor devtmpfsd()
    ...

    Linus Torvalds
     

05 Aug, 2020

3 commits

  • Add a simple helper to grab a reference to a file and install it at
    the next available fd, and switch the early init code over to it.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Christoph Hellwig
     
  • Pull execve updates from Eric Biederman:
    "During the development of v5.7 I ran into bugs and quality of
    implementation issues related to exec that could not be easily fixed
    because of the way exec is implemented. So I have been digging into
    exec and cleaning up what I can.

    This cycle I have been looking at different ideas and different
    implementations to see what is possible to improve exec, and cleaning
    the way exec interfaces with in-kernel users. Only cleaning up the
    interfaces of exec with the rest of the kernel has managed to stabilize
    and make it through review in time for v5.9-rc1 resulting in 2 sets of
    changes this cycle.

    - Implement kernel_execve

    - Make the user mode driver code a better citizen

    With kernel_execve the code size got a little larger as the copying of
    parameters from userspace and the copying of parameters from kernel
    space are now separate. The good news is kernel threads no longer need
    to play games with set_fs to use exec. Which when combined with the
    rest of Christoph's set_fs changes should make security bugs with
    set_fs much more difficult"

    * 'exec-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (23 commits)
    exec: Implement kernel_execve
    exec: Factor bprm_stack_limits out of prepare_arg_pages
    exec: Factor bprm_execve out of do_execve_common
    exec: Move bprm_mm_init into alloc_bprm
    exec: Move initialization of bprm->filename into alloc_bprm
    exec: Factor out alloc_bprm
    exec: Remove unnecessary spaces from binfmts.h
    umd: Stop using split_argv
    umd: Remove exit_umh
    bpfilter: Take advantage of the facilities of struct pid
    exit: Factor thread_group_exited out of pidfd_poll
    umd: Track user space drivers with struct pid
    bpfilter: Move bpfilter_umh back into init data
    exec: Remove do_execve_file
    umh: Stop calling do_execve_file
    umd: Transform fork_usermode_blob into fork_usermode_driver
    umd: Rename umd_info.cmdline umd_info.driver_name
    umd: For clarity rename umh_info umd_info
    umh: Separate the user mode driver and the user mode helper support
    umh: Remove call_usermodehelper_setup_file.
    ...

    Linus Torvalds
     
  • Since parse_args() stops parsing at '--', bootconfig_params()
    will never get '--' as a param and initargs_found will never be true.
    As a result, if we pass some init arguments via the bootconfig,
    those are always appended to the kernel command line with '--'
    even if the kernel command line already has '--'.

    To fix this correctly, check the return value of parse_args()
    and set initargs_found to true if the return value is not an error
    but a valid address.

    Link: https://lkml.kernel.org/r/159650953285.270383.14822353843556363851.stgit@devnote2

    Fixes: f61872bb58a1 ("bootconfig: Use parse_args() to find bootconfig and '--'")
    Cc: stable@vger.kernel.org
    Reported-by: Arvind Sankar
    Suggested-by: Arvind Sankar
    Signed-off-by: Masami Hiramatsu
    Signed-off-by: Steven Rostedt (VMware)

    Masami Hiramatsu
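
    The idea of deriving initargs_found from the parser's return value can
    be sketched in user space as below. args_after_double_dash() is a
    hypothetical, simplified stand-in for the kernel parser: it returns a
    pointer to the text after a standalone "--" token, or NULL when there
    is none. It does not model the error-pointer case or quoting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for the parser (an assumption, not the kernel API):
 * return the text following a standalone "--" token, or NULL. */
static const char *args_after_double_dash(const char *cmdline)
{
    const char *p = cmdline;

    assert(cmdline != NULL);
    while (*p) {
        size_t n;

        while (*p == ' ')
            p++;                 /* skip separators */
        n = strcspn(p, " ");     /* length of the current token */
        if (n == 2 && p[0] == '-' && p[1] == '-') {
            p += 2;
            while (*p == ' ')
                p++;
            return p;            /* init arguments start here */
        }
        p += n;
    }
    return NULL;
}

/* Init arguments are present exactly when the parser returned a valid
 * pointer, rather than relying on ever seeing "--" as a parameter. */
static bool initargs_found(const char *cmdline)
{
    return args_after_double_dash(cmdline) != NULL;
}
```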
     

31 Jul, 2020

4 commits


21 Jul, 2020

1 commit

  • To allow the kernel to call exec without playing games with set_fs,
    implement kernel_execve. The function kernel_execve takes pointers
    into kernel memory and copies the values pointed to onto the new
    userspace stack.

    The calls with arguments from kernel space of do_execve are replaced
    with calls to kernel_execve.

    The calls do_execve and do_execveat are made static as there are now
    no callers outside of exec.

    The comments that mention do_execve are updated to refer to
    kernel_execve or execve depending on the circumstances. In addition
    to correcting the comments, this makes it easy to grep for do_execve
    and verify it is not used.

    Inspired-by: https://lkml.kernel.org/r/20200627072704.2447163-1-hch@lst.de
    Reviewed-by: Kees Cook
    Link: https://lkml.kernel.org/r/87wo365ikj.fsf@x220.int.ebiederm.org
    Signed-off-by: "Eric W. Biederman"

    Eric W. Biederman
     

16 Jun, 2020

1 commit

  • In addition to -ftrivial-auto-var-init=pattern (used by
    CONFIG_INIT_STACK_ALL now) Clang also supports zero initialization for
    locals enabled by -ftrivial-auto-var-init=zero. The future of this flag
    is still being debated (see https://bugs.llvm.org/show_bug.cgi?id=45497).
    Right now it is guarded by another flag,
    -enable-trivial-auto-var-init-zero-knowing-it-will-be-removed-from-clang,
    which means it may not be supported by future Clang releases. Another
    possible resolution is that -ftrivial-auto-var-init=zero will persist
    (as certain users have already started depending on it), but the name
    of the guard flag will change.

    In the meantime, zero initialization has proven itself as a good
    production mitigation measure against uninitialized locals. Unlike pattern
    initialization, which has a higher chance of triggering existing bugs,
    zero initialization provides safe defaults for strings, pointers, indexes,
    and sizes. On the other hand, pattern initialization remains safer for
    return values. Chrome OS and Android are moving to using zero
    initialization for production builds.

    Performance-wise, the difference between pattern and zero initialization
    is usually negligible, although the generated code for zero
    initialization is more compact.

    This patch renames CONFIG_INIT_STACK_ALL to CONFIG_INIT_STACK_ALL_PATTERN
    and introduces another config option, CONFIG_INIT_STACK_ALL_ZERO, that
    enables zero initialization for locals if the corresponding flags are
    supported by Clang.

    Cc: Kees Cook
    Cc: Nick Desaulniers
    Cc: Greg Kroah-Hartman
    Signed-off-by: Alexander Potapenko
    Link: https://lore.kernel.org/r/20200616083435.223038-1-glider@google.com
    Reviewed-by: Maciej Żenczykowski
    Signed-off-by: Kees Cook

    glider@google.com
     

12 Jun, 2020

1 commit

  • Merge the state of the locking kcsan branch before the read/write_once()
    and the atomics modifications got merged.

    Squash the fallout of the rebase on top of the read/write once and atomic
    fallback work into the merge. The history of the original branch is
    preserved in tag locking-kcsan-2020-06-02.

    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     

09 Jun, 2020

1 commit

  • Patch series "support setting sysctl parameters from kernel command line", v3.

    This series adds support for something that seems like many people
    always wanted but nobody added it yet, so here's the ability to set
    sysctl parameters via kernel command line options in the form of
    sysctl.vm.something=1

    The important part is Patch 1. The second, not so important part is an
    attempt to clean up legacy one-off parameters that do the same thing as
    a sysctl. I don't want to remove them completely for compatibility
    reasons, but with generic sysctl support the idea is to remove the
    one-off param handlers and treat the parameters as aliases for the
    sysctl variants.

    I have identified several parameters that mention sysctl counterparts in
    Documentation/admin-guide/kernel-parameters.txt but there might be more.
    The conversion also has varying levels of success:

    - numa_zonelist_order is converted in Patch 2 together with adding the
    necessary infrastructure. It's easy as it doesn't really do anything
    but warn on deprecated value these days.

    - hung_task_panic is converted in Patch 3, but there's a downside that
    now it only accepts 0 and 1, while previously it was any integer
    value

    - nmi_watchdog maps to two sysctls nmi_watchdog and hardlockup_panic,
    so there's no straightforward conversion possible

    - traceoff_on_warning is a flag without value and it would be required
    to handle that somehow in the conversion infrastructure, which seems
    pointless for a single flag

    This patch (of 5):

    A recently proposed patch to add vm_swappiness command line parameter in
    addition to existing sysctl [1] made me wonder why we don't have a
    general support for passing sysctl parameters via command line.

    Googling found only somebody else wondering the same [2], but I haven't
    found any prior discussion with reasons why not to do this.

    Setting the vm_swappiness issue aside (the underlying issue might be
    solved in a different way), quick search of kernel-parameters.txt shows
    there are already some that exist as both sysctl and kernel parameter -
    hung_task_panic, nmi_watchdog, numa_zonelist_order, traceoff_on_warning.

    A general mechanism would remove the need to add more of those one-offs
    and might be handy in situations where configuration by e.g.
    /etc/sysctl.d/ is impractical.

    Hence, this patch adds a new parse_args() pass that looks for parameters
    prefixed by 'sysctl.' and tries to interpret them as writes to the
    corresponding sys/ files using a temporary in-kernel procfs mount.
    This mechanism was suggested by Eric W. Biederman [3], as it handles
    all dynamically registered sysctl tables, even though we don't handle
    modular sysctls. Errors due to e.g. invalid parameter name or value
    are reported in the kernel log.

    The processing is hooked right before the init process is loaded, as
    some handlers might be more complicated than simple setters and might
    need some subsystems to be initialized. If at that moment the init
    process can be started and eventually execute a process writing to
    /proc/sys/, then it should also be fine to do that from the kernel.

    Sysctls registered later on module load time are not set by this
    mechanism - it's expected that in such scenarios, setting sysctl values
    from userspace is practical enough.

    [1] https://lore.kernel.org/r/BL0PR02MB560167492CA4094C91589930E9FC0@BL0PR02MB5601.namprd02.prod.outlook.com/
    [2] https://unix.stackexchange.com/questions/558802/how-to-set-sysctl-using-kernel-command-line-parameter
    [3] https://lore.kernel.org/r/87bloj2skm.fsf@x220.int.ebiederm.org/

    Signed-off-by: Vlastimil Babka
    Signed-off-by: Andrew Morton
    Reviewed-by: Luis Chamberlain
    Reviewed-by: Masami Hiramatsu
    Acked-by: Kees Cook
    Acked-by: Michal Hocko
    Cc: Iurii Zaikin
    Cc: Ivan Teterevkov
    Cc: Michal Hocko
    Cc: David Rientjes
    Cc: Matthew Wilcox
    Cc: "Eric W . Biederman"
    Cc: "Guilherme G . Piccoli"
    Cc: Alexey Dobriyan
    Cc: Thomas Gleixner
    Cc: Greg Kroah-Hartman
    Cc: Christian Brauner
    Link: http://lkml.kernel.org/r/20200427180433.7029-1-vbabka@suse.cz
    Link: http://lkml.kernel.org/r/20200427180433.7029-2-vbabka@suse.cz
    Signed-off-by: Linus Torvalds

    Vlastimil Babka
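
    The mapping from a 'sysctl.'-prefixed parameter to a file under
    /proc/sys/ can be sketched in user space as follows.
    sysctl_param_to_path() is a hypothetical helper. Turning every dot
    into a path separator is a simplifying assumption; it would mishandle
    sysctl names that themselves contain dots.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Convert "sysctl.vm.something" into "/proc/sys/vm/something".
 * Returns 0 on success, -1 if the parameter lacks the "sysctl." prefix
 * or the output buffer is too small. */
static int sysctl_param_to_path(const char *param, char *out, size_t outlen)
{
    static const char prefix[] = "sysctl.";
    const size_t plen = sizeof(prefix) - 1;
    char *p;
    int n;

    assert(param != NULL && out != NULL);
    if (strncmp(param, prefix, plen) != 0 || param[plen] == '\0')
        return -1;  /* not a sysctl parameter */

    n = snprintf(out, outlen, "/proc/sys/%s", param + plen);
    if (n < 0 || (size_t)n >= outlen)
        return -1;  /* path did not fit */

    /* Assumption: every '.' in the name is a directory separator. */
    for (p = out + strlen("/proc/sys/"); *p != '\0'; p++)
        if (*p == '.')
            *p = '/';
    return 0;
}
```

    A caller would then write the parameter's value to the resulting path
    and log an error if the file does not exist or rejects the value.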
     

05 Jun, 2020

1 commit

  • Some init systems (eg. systemd) have init at their own paths, for
    example, /usr/lib/systemd/systemd. A compatibility symlink to one of the
    hardcoded init paths is provided by another package, usually named
    something like systemd-sysvcompat or similar.

    Currently distro maintainers who are hands-off on the bootloader are more
    or less required to include those compatibility links as part of their
    base distribution, because it's hard to migrate away from them since
    there's a risk some users will not get the message to set init= on the
    kernel command line appropriately.

    Moreover, for distributions where the init system is something the
    distribution itself is opinionated about (eg. Arch, which has systemd in
    the required `base` package), we could usually reasonably configure this
    ahead of time when building the distribution kernel. However, we
    currently simply don't have any way to configure the kernel to do this.
    Here's an example discussion where removing sysvcompat was discussed by
    distro maintainers[0].

    This patch adds a new Kconfig tunable, CONFIG_DEFAULT_INIT, which if set
    is tried before the hardcoded fallback list. So the order of precedence
    is now thus:

    1. init= on command line (on failure: panic)
    2. CONFIG_DEFAULT_INIT (on failure: try #3)
    3. Hardcoded fallback list (on failure: panic)

    This new config parameter will allow distribution maintainers to move away
    from these compatibility links safely, without having to worry that their
    users might not have the right init=.

    There are also two other benefits of this over having the distribution
    maintain a symlink:

    1. One of the value propositions over simply having distributions
    maintain a /sbin/init symlink via a package is that it also frees
    distributions which have a preferred default, but not mandatory, init
    system from having their package manager fight with their users for
    control of /{s,}bin/init. Instead, the distribution simply makes
    their preference known in CONFIG_DEFAULT_INIT, and if the user
    installs another init system and uninstalls the default one they can
    still make use of /{s,}bin/init and friends for their own uses. This
    makes more cases Just Work(tm) without the user having to perform
    extra configuration via init=.

    2. Since before this we don't know which path the distribution actually
    _intends_ to serve init from, we don't pr_err if it is simply
    missing, and usually will just silently put the user in a /bin/sh
    shell. Now that the distribution can make a declaration of intent, we
    can be more vocal when this init system fails to launch for any
    reason, even if it's simply because no file exists at that location,
    speeding up the palaver of init/mount dependency/etc debugging a bit.

    [0]: https://lists.archlinux.org/pipermail/arch-dev-public/2019-January/029435.html

    Signed-off-by: Chris Down
    Signed-off-by: Andrew Morton
    Cc: Greg Kroah-Hartman
    Cc: Masami Hiramatsu
    Link: http://lkml.kernel.org/r/20200522160234.GA1487022@chrisdown.name
    Signed-off-by: Linus Torvalds

    Chris Down
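
    The order of precedence listed above can be sketched as a small
    user-space function. choose_init() and the try_exec callback are
    hypothetical names, the two *_exists() helpers are test stand-ins for
    "attempt to launch this path", and the contents of the fallback list
    are an assumption. A NULL return stands for the panic case.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Callback that tries to launch a path; returns 0 on success. */
typedef int (*try_exec_fn)(const char *path);

/* Assumed hardcoded fallback list. */
static const char *const fallback_inits[] = {
    "/sbin/init", "/etc/init", "/bin/init", "/bin/sh",
};

/* Return the path that was launched, or NULL for "panic". */
static const char *choose_init(const char *cmdline_init,
                               const char *default_init,
                               try_exec_fn try_exec)
{
    size_t i;

    assert(try_exec != NULL);

    /* 1. init= on the command line: no fallback on failure. */
    if (cmdline_init != NULL)
        return try_exec(cmdline_init) == 0 ? cmdline_init : NULL;

    /* 2. Configured default: on failure, fall through to the list. */
    if (default_init != NULL && default_init[0] != '\0' &&
        try_exec(default_init) == 0)
        return default_init;

    /* 3. Hardcoded fallback list: panic if nothing works. */
    for (i = 0; i < sizeof(fallback_inits) / sizeof(fallback_inits[0]); i++)
        if (try_exec(fallback_inits[i]) == 0)
            return fallback_inits[i];
    return NULL;
}

/* Stand-ins for systems where only one init candidate exists. */
static int only_bin_sh_exists(const char *path)
{
    return strcmp(path, "/bin/sh") == 0 ? 0 : -1;
}

static int only_systemd_exists(const char *path)
{
    return strcmp(path, "/usr/lib/systemd/systemd") == 0 ? 0 : -1;
}
```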
     

04 Jun, 2020

1 commit

  • padata will soon initialize the system's struct pages in parallel, so it
    needs to be ready by page_alloc_init_late().

    The error return from padata_driver_init() triggers an initcall warning,
    so add a warning to padata_init() to avoid silent failure.

    Signed-off-by: Daniel Jordan
    Signed-off-by: Andrew Morton
    Tested-by: Josh Triplett
    Cc: Alexander Duyck
    Cc: Alex Williamson
    Cc: Dan Williams
    Cc: Dave Hansen
    Cc: David Hildenbrand
    Cc: Herbert Xu
    Cc: Jason Gunthorpe
    Cc: Jonathan Corbet
    Cc: Kirill Tkhai
    Cc: Michal Hocko
    Cc: Pavel Machek
    Cc: Pavel Tatashin
    Cc: Peter Zijlstra
    Cc: Randy Dunlap
    Cc: Robert Elliott
    Cc: Shile Zhang
    Cc: Steffen Klassert
    Cc: Steven Sistare
    Cc: Tejun Heo
    Cc: Zi Yan
    Link: http://lkml.kernel.org/r/20200527173608.2885243-3-daniel.m.jordan@oracle.com
    Signed-off-by: Linus Torvalds

    Daniel Jordan
     

18 May, 2020

1 commit


15 May, 2020

1 commit

  • ... or the odyssey of trying to disable the stack protector for the
    function which generates the stack canary value.

    The whole story started with Sergei reporting a boot crash with a kernel
    built with gcc-10:

    Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: start_secondary
    CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.6.0-rc5-00235-gfffb08b37df9 #139
    Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./H77M-D3H, BIOS F12 11/14/2013
    Call Trace:
    dump_stack
    panic
    ? start_secondary
    __stack_chk_fail
    start_secondary
    secondary_startup_64
    ---[ end Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: start_secondary

    This happens because gcc-10 tail-call optimizes the last function call
    in start_secondary() - cpu_startup_entry() - and thus emits a stack
    canary check which fails because the canary value changes after the
    boot_init_stack_canary() call.

    To fix that, the initial attempt was to mark the one function which
    generates the stack canary with:

    __attribute__((optimize("-fno-stack-protector"))) ... start_secondary(void *unused)

    however, using the optimize attribute doesn't work cumulatively
    as the attribute does not add to but rather replaces previously
    supplied optimization options - roughly all -fxxx options.

    The key one among them is -fno-omit-frame-pointer, so the frame
    pointer, which the kernel needs, ends up missing.

    The next attempt to prevent compilers from tail-call optimizing
    the last function call cpu_startup_entry(), shy of carving out
    start_secondary() into a separate compilation unit and building it with
    -fno-stack-protector, was to add an empty asm("").

    This current solution was short and sweet, and reportedly, is supported
    by both compilers but we didn't get very far this time: future (LTO?)
    optimization passes could potentially eliminate this, which leads us
    to the third attempt: having an actual memory barrier there which the
    compiler cannot ignore or move around etc.

    That should hold for a long time, but hey we said that about the other
    two solutions too so...

    Reported-by: Sergei Trofimovich
    Signed-off-by: Borislav Petkov
    Tested-by: Kalle Valo
    Cc:
    Link: https://lkml.kernel.org/r/20200314164451.346497-1-slyfox@gentoo.org

    Borislav Petkov
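
    The third attempt described above, placing a real memory barrier after
    the last call so that the call is no longer in tail position, can be
    sketched in portable C11 as below. secondary_start() and
    startup_entry() are hypothetical stand-ins, and the C11 fence is an
    assumption standing in for whatever barrier primitive the kernel
    actually uses.

```c
#include <assert.h>
#include <stdatomic.h>

static int calls;

/* Stand-in for the final function called during CPU bring-up. */
static void startup_entry(void)
{
    calls++;
}

static void secondary_start(void)
{
    /* ... canary setup and other initialization would happen here ... */

    startup_entry();

    /* A full fence after the call: the fence must execute after the
     * callee returns, so the call above is not the function's last
     * operation and cannot be emitted as a tail call (jump). */
    atomic_thread_fence(memory_order_seq_cst);
}
```

    Unlike an empty asm(""), a real fence has an observable ordering
    effect, which is the reason given above for expecting later
    optimization passes to leave it in place.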
     

12 May, 2020

1 commit

  • Commit de462e5f1071 ("bootconfig: Fix to remove bootconfig
    data from initrd while boot") causes a cosmetic regression
    in dmesg: it prints a "no bootconfig data" warning even when the
    bootconfig cmdline option is not given.

    Fix setup_boot_config() by moving the no-bootconfig check after
    the command line option check.

    Link: http://lkml.kernel.org/r/9b1ba335-071d-c983-89a4-2677b522dcc8@molgen.mpg.de
    Link: http://lkml.kernel.org/r/158916116468.21787.14558782332170588206.stgit@devnote2

    Fixes: de462e5f1071 ("bootconfig: Fix to remove bootconfig data from initrd while boot")
    Reported-by: Paul Menzel
    Reviewed-by: Paul Menzel
    Signed-off-by: Masami Hiramatsu
    Signed-off-by: Steven Rostedt (VMware)

    Masami Hiramatsu
     

06 May, 2020

1 commit

  • If there is bootconfig data at the tail of the initrd/initramfs,
    the initrd image sanity check causes an error during the
    decompression stage, as follows.

    [ 0.883882] Unpacking initramfs...
    [ 2.696429] Initramfs unpacking failed: invalid magic at start of compressed archive

    This error is ignored if CONFIG_BLK_DEV_RAM=n, but with
    CONFIG_BLK_DEV_RAM=y the kernel fails to mount the rootfs
    and panics.

    To fix this issue, shrink initrd_end to remove the trailing
    bootconfig data while booting the kernel.

    Link: http://lkml.kernel.org/r/158788401014.24243.17424755854115077915.stgit@devnote2

    Cc: Borislav Petkov
    Cc: Kees Cook
    Cc: Ingo Molnar
    Cc: Andrew Morton
    Cc: stable@vger.kernel.org
    Fixes: 7684b8582c24 ("bootconfig: Load boot config from the tail of initrd")
    Signed-off-by: Masami Hiramatsu
    Signed-off-by: Steven Rostedt (VMware)

    Masami Hiramatsu
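
    Shrinking the end of the initrd so the decompressor never sees the
    trailing bootconfig can be sketched in user space as below.
    trim_bootconfig() is a hypothetical helper. The footer layout (le32
    size, le32 checksum, then a 12-byte magic word) is an assumption taken
    from later bootconfig formats, and checksum verification is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAGIC      "#BOOTCONFIG\n"     /* assumed magic word */
#define MAGIC_LEN  12
#define FOOTER_LEN (8 + MAGIC_LEN)     /* size + checksum + magic */

/* Return the length of the initrd with any trailing bootconfig removed.
 * If no valid footer is found, the length is returned unchanged. */
static size_t trim_bootconfig(const unsigned char *buf, size_t len)
{
    const unsigned char *f;
    uint32_t size;

    assert(buf != NULL);
    if (len < FOOTER_LEN)
        return len;
    if (memcmp(buf + len - MAGIC_LEN, MAGIC, MAGIC_LEN) != 0)
        return len;                    /* no bootconfig appended */

    f = buf + len - FOOTER_LEN;        /* start of the size field */
    size = (uint32_t)f[0] | ((uint32_t)f[1] << 8) |
           ((uint32_t)f[2] << 16) | ((uint32_t)f[3] << 24);
    if (size > len - FOOTER_LEN)
        return len;                    /* implausible size, leave as is */

    /* New end: just before the bootconfig data. */
    return len - FOOTER_LEN - size;
}
```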
     

13 Apr, 2020

1 commit


11 Apr, 2020

1 commit

  • printk_deferred(), similarly to printk_safe/printk_nmi, does not
    immediately attempt to print a new message on the consoles, avoiding
    calls into non-reentrant kernel paths, e.g. scheduler or timekeeping,
    which potentially can deadlock the system.

    Those printk() flavors instead rely on per-CPU flush irq_work to print
    messages from safer contexts. For the same reasons (recursive scheduler or
    timekeeping calls), printk() uses per-CPU irq_work to wake up
    user space syslog/kmsg readers.

    However, only printk_safe/printk_nmi do make sure that per-CPU areas
    have been initialised and that it's safe to modify per-CPU irq_work.
    This means that, for instance, should printk_deferred() be invoked "too
    early", that is before per-CPU areas are initialised, printk_deferred()
    will perform illegal per-CPU access.

    Lech Perczak [0] reports that after commit 1b710b1b10ef ("char/random:
    silence a lockdep splat with printk()") user-space syslog/kmsg readers
    are not able to read new kernel messages.

    The reason is printk_deferred() being called too early (as was pointed
    out by Petr and John).

    Fix printk_deferred() and do not queue per-CPU irq_work before per-CPU
    areas are initialized.
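
    A minimal user-space model of that guard is sketched below. All
    names are hypothetical and the counters only stand in for the log
    buffer and the per-CPU irq_work; the assumption that the message
    itself is still stored when called too early is part of the model,
    not something this commit message states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

static bool percpu_ready;     /* set once per-CPU areas are usable */
static int stored_messages;   /* stands in for the kernel log buffer */
static int pending_flushes;   /* stands in for queued per-CPU irq_work */

/* Model of the point in boot where per-CPU areas become usable. */
static void model_percpu_init(void)
{
	percpu_ready = true;
}

/* Store the message, but only queue deferred work when it is safe. */
static void model_printk_deferred(const char *msg)
{
	assert(msg != NULL);
	stored_messages++;      /* assumption: the message is always kept */
	if (!percpu_ready)
		return;         /* too early: do not touch per-CPU state */
	pending_flushes++;      /* safe: queue the deferred flush */
}
```

    A call made before model_percpu_init() leaves pending_flushes
    untouched; a call made afterwards queues one flush.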

    Link: https://lore.kernel.org/lkml/aa0732c6-5c4e-8a8b-a1c1-75ebe3dca05b@camlintechnologies.com/
    Reported-by: Lech Perczak
    Signed-off-by: Sergey Senozhatsky
    Tested-by: Jann Horn
    Reviewed-by: Petr Mladek
    Cc: Greg Kroah-Hartman
    Cc: Theodore Ts'o
    Cc: John Ogness
    Signed-off-by: Linus Torvalds

    Sergey Senozhatsky
     

21 Mar, 2020

2 commits


04 Mar, 2020

1 commit

  • Show the line and column when the bootconfig tool hits a parse error.
    Currently lib/bootconfig reports parse errors by byte offset, which
    is not human readable.
    With this change, xbc_init() no longer prints the error message itself;
    instead it passes the error message and position to the caller, so that
    the caller can decode it and show the error message with line and
    column numbers.

    With this patch, the bootconfig tool shows an error with line:column as
    below.

    $ cat samples/bad-dotword.bconf
    # do not start keyword with .
    key {
    .word = 1
    }
    $ ./bootconfig -a samples/bad-dotword.bconf initrd
    Parse Error: Invalid keyword at 3:3
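
    The decoding step on the caller side can be sketched as follows.
    This is an illustration only: offset_to_line_col() is a hypothetical
    helper, and 1-based line and column numbering is an assumption, not
    taken from the tool's source.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Convert a 0-based byte offset into a 1-based line and column.
 * Scanning stops early if the buffer ends before pos is reached.
 */
static void offset_to_line_col(const char *buf, size_t pos,
			       int *line, int *col)
{
	size_t i;

	assert(buf != NULL && line != NULL && col != NULL);
	*line = 1;
	*col = 1;
	for (i = 0; i < pos && buf[i] != '\0'; i++) {
		if (buf[i] == '\n') {
			(*line)++;      /* newline: next line, column resets */
			*col = 1;
		} else {
			(*col)++;
		}
	}
}
```

    The caller passes the whole bootconfig text and the error offset
    returned by the parser, then prints the message followed by
    line:column.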

    Link: http://lkml.kernel.org/r/158323469002.10560.4023923847704522760.stgit@devnote2

    Signed-off-by: Masami Hiramatsu
    Signed-off-by: Steven Rostedt (VMware)

    Masami Hiramatsu
     

27 Feb, 2020

1 commit

  • Pull tracing and bootconfig updates:
    "Fixes and changes to bootconfig before it goes live in a release.

    Change in API of bootconfig (before it comes live in a release):
    - Have a magic value "BOOTCONFIG" in initrd to know a bootconfig exists
    - Set CONFIG_BOOT_CONFIG to 'n' by default
    - Show error if "bootconfig" on cmdline but not compiled in
    - Prevent redefining the same value
    - Have a way to append values
    - Added a SELECT BLK_DEV_INITRD to fix a build failure

    Synthetic event fixes:
    - Switch to raw_smp_processor_id() for recording CPU value in preempt
    section. (No care for what the value actually is)
    - Fix samples always recording u64 values
    - Fix endianness
    - Check number of values matches number of fields
    - Fix a printing bug

    Fix of trace_printk() breaking postponed start up tests

    Make a function static that is only used in a single file"

    * tag 'trace-v5.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
    bootconfig: Fix CONFIG_BOOTTIME_TRACING dependency issue
    bootconfig: Add append value operator support
    bootconfig: Prohibit re-defining value on same key
    bootconfig: Print array as multiple commands for legacy command line
    bootconfig: Reject subkey and value on same parent key
    tools/bootconfig: Remove unneeded error message silencer
    bootconfig: Add bootconfig magic word for indicating bootconfig explicitly
    bootconfig: Set CONFIG_BOOT_CONFIG=n by default
    tracing: Clear trace_state when starting trace
    bootconfig: Mark boot_config_checksum() static
    tracing: Disable trace_printk() on post poned tests
    tracing: Have synthetic event test use raw_smp_processor_id()
    tracing: Fix number printing bug in print_synth_event()
    tracing: Check that number of vals matches number of synth event fields
    tracing: Make synth_event trace functions endian-correct
    tracing: Make sure synth_event_trace() example always uses u64

    Linus Torvalds
     

21 Feb, 2020

4 commits

  • Print array values as multiple instances of the same option on
    the legacy kernel command line. With this rule, the "kernel.*" and
    "init.*" array entries in bootconfig are printed out as
    repeated options, e.g.

    kernel {
    console = "ttyS0,115200"
    console += "tty0"
    }

    will be correctly converted to

    console="ttyS0,115200" console="tty0"

    in the kernel command line.
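
    The conversion rule can be sketched as below. render_array() is a
    hypothetical helper written for illustration, not the kernel's
    implementation; it assumes the values contain no double quotes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Render one key with n values as repeated key="value" options
 * separated by single spaces. Returns the number of characters
 * written (excluding the NUL), or -1 if buf is too small.
 */
static int render_array(char *buf, size_t size, const char *key,
			const char *const *vals, size_t n)
{
	size_t used = 0;
	size_t i;

	assert(buf != NULL && key != NULL && size > 0);
	buf[0] = '\0';
	for (i = 0; i < n; i++) {
		/* key + '=' + two quotes + value, plus a separator */
		size_t need = strlen(key) + strlen(vals[i]) + 3 + (i ? 1 : 0);

		if (need >= size - used)
			return -1;      /* no room for text plus NUL */
		snprintf(buf + used, size - used, "%s%s=\"%s\"",
			 i ? " " : "", key, vals[i]);
		used += need;
	}
	return (int)used;
}
```

    Called with the key "console" and the two values from the example
    above, it fills the buffer with the two console options in order.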

    Link: http://lkml.kernel.org/r/158220118213.26565.8163300497009463916.stgit@devnote2

    Reported-by: Borislav Petkov
    Signed-off-by: Masami Hiramatsu
    Signed-off-by: Steven Rostedt (VMware)

    Masami Hiramatsu
     
  • Add a bootconfig magic word to the end of the bootconfig in the initrd
    image to indicate explicitly that a bootconfig is there.
    tools/bootconfig now also treats a wrong size, a wrong checksum, or a
    parse error as an error, because if the bootconfig magic
    word is present, there must be a bootconfig.

    The bootconfig magic word is "#BOOTCONFIG\n", a 12-byte word.
    Thus the block image of the initrd file with bootconfig is
    as follows.

    [Initrd][bootconfig][size][csum][#BOOTCONFIG\n]
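
    Walking that layout backwards from the end of the image can be
    sketched as follows. This is a user-space illustration with
    assumptions: find_bootconfig() is a hypothetical name, the size and
    csum fields are taken to be 4 bytes each and little-endian (a later
    commit in this log loads them as le32), and the checksum is assumed
    to be a plain sum of the data bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BOOTCONFIG_MAGIC     "#BOOTCONFIG\n"
#define BOOTCONFIG_MAGIC_LEN 12

/* Read a 32-bit little-endian value from an unaligned byte pointer. */
static uint32_t read_le32(const unsigned char *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Assumed checksum: plain sum of all data bytes. */
static uint32_t byte_sum(const unsigned char *p, size_t len)
{
	uint32_t sum = 0;

	while (len--)
		sum += *p++;
	return sum;
}

/*
 * Locate the bootconfig data at the tail of an initrd image.
 * Returns a pointer to the data and stores its length in *len,
 * or returns NULL if the magic, size, or checksum does not match.
 */
static const unsigned char *find_bootconfig(const unsigned char *img,
					    size_t img_len, size_t *len)
{
	const size_t footer = 4 + 4 + BOOTCONFIG_MAGIC_LEN;
	const unsigned char *magic, *data;
	uint32_t size, csum;

	assert(img != NULL && len != NULL);
	if (img_len < footer)
		return NULL;
	magic = img + img_len - BOOTCONFIG_MAGIC_LEN;
	if (memcmp(magic, BOOTCONFIG_MAGIC, BOOTCONFIG_MAGIC_LEN) != 0)
		return NULL;            /* no magic word: no bootconfig */
	size = read_le32(magic - 8);
	csum = read_le32(magic - 4);
	if (size > img_len - footer)
		return NULL;            /* size field exceeds the image */
	data = magic - 8 - size;
	if (byte_sum(data, size) != csum)
		return NULL;            /* checksum mismatch */
	*len = size;
	return data;
}
```

    On success the returned pointer also marks where the initrd proper
    ends, since the bootconfig data starts right after it.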

    Link: http://lkml.kernel.org/r/158220112263.26565.3944814205960612841.stgit@devnote2

    Suggested-by: Steven Rostedt
    Signed-off-by: Masami Hiramatsu
    Signed-off-by: Steven Rostedt (VMware)

    Masami Hiramatsu
     
  • Set CONFIG_BOOT_CONFIG=n by default. This also warns the
    user if CONFIG_BOOT_CONFIG=n but "bootconfig" is given
    on the kernel command line.

    Link: http://lkml.kernel.org/r/158220111291.26565.9036889083940367969.stgit@devnote2

    Suggested-by: Steven Rostedt
    Signed-off-by: Masami Hiramatsu
    Signed-off-by: Steven Rostedt (VMware)

    Masami Hiramatsu
     
  • boot_config_checksum() is only used in this file, so mark it 'static'.

    Link: http://lkml.kernel.org/r/1581852511-14163-1-git-send-email-hqjagain@gmail.com

    Acked-by: Masami Hiramatsu
    Signed-off-by: Qiujun Huang
    Signed-off-by: Steven Rostedt (VMware)

    Qiujun Huang