06 Jan, 2019

2 commits

  • Currently, CONFIG_JUMP_LABEL just means "I _want_ to use jump label".

    The jump label is controlled by HAVE_JUMP_LABEL, which is defined
    like this:

    #if defined(CC_HAVE_ASM_GOTO) && defined(CONFIG_JUMP_LABEL)
    # define HAVE_JUMP_LABEL
    #endif

    We can improve this by testing 'asm goto' support in Kconfig, then
    make JUMP_LABEL depend on CC_HAS_ASM_GOTO.

    The ugly #ifdef HAVE_JUMP_LABEL will go away, and CONFIG_JUMP_LABEL will
    match the real kernel capability.
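
    A minimal C sketch of the mismatch described above, under the old
    scheme. CONFIG_JUMP_LABEL is defined by hand and CC_HAVE_ASM_GOTO is
    deliberately left undefined; in a real build both come from the build
    system. The two helper functions are hypothetical and exist only for
    illustration.

```c
/* Assumption: stand-ins for macros the kernel build system would provide. */
#define CONFIG_JUMP_LABEL 1   /* the user asked for jump labels */
#undef CC_HAVE_ASM_GOTO       /* but the compiler lacks 'asm goto' */

/* Old scheme: the usable feature is a derived macro. */
#if defined(CC_HAVE_ASM_GOTO) && defined(CONFIG_JUMP_LABEL)
# define HAVE_JUMP_LABEL
#endif

/* Hypothetical helper: 1 if the configuration requests jump labels. */
int config_wants_jump_label(void)
{
#ifdef CONFIG_JUMP_LABEL
	return 1;
#else
	return 0;
#endif
}

/* Hypothetical helper: 1 if jump labels are actually usable. */
int kernel_has_jump_label(void)
{
#ifdef HAVE_JUMP_LABEL
	return 1;
#else
	return 0;
#endif
}
```

    With these definitions the configuration requests the feature while the
    derived macro stays unset. Making JUMP_LABEL depend on CC_HAS_ASM_GOTO in
    Kconfig rules that combination out, so a single CONFIG_JUMP_LABEL test is
    enough.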

    Signed-off-by: Masahiro Yamada
    Acked-by: Michael Ellerman (powerpc)
    Tested-by: Sedat Dilek

    Masahiro Yamada
     
  • Pull vfs mount API prep from Al Viro:
    "Mount API prereqs.

    Mostly that's LSM mount options cleanups. There are several minor
    fixes in there, but nothing earth-shattering (leaks on failure exits,
    mostly)"

    * 'mount.part1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (27 commits)
    mount_fs: suppress MAC on MS_SUBMOUNT as well as MS_KERNMOUNT
    smack: rewrite smack_sb_eat_lsm_opts()
    smack: get rid of match_token()
    smack: take the guts of smack_parse_opts_str() into a new helper
    LSM: new method: ->sb_add_mnt_opt()
    selinux: rewrite selinux_sb_eat_lsm_opts()
    selinux: regularize Opt_... names a bit
    selinux: switch away from match_token()
    selinux: new helper - selinux_add_opt()
    LSM: bury struct security_mnt_opts
    smack: switch to private smack_mnt_opts
    selinux: switch to private struct selinux_mnt_opts
    LSM: hide struct security_mnt_opts from any generic code
    selinux: kill selinux_sb_get_mnt_opts()
    LSM: turn sb_eat_lsm_opts() into a method
    nfs_remount(): don't leak, don't ignore LSM options quietly
    btrfs: sanitize security_mnt_opts use
    selinux; don't open-code a loop in sb_finish_set_opts()
    LSM: split ->sb_set_mnt_opts() out of ->sb_kern_mount()
    new helper: security_sb_eat_lsm_opts()
    ...

    Linus Torvalds
     

05 Jan, 2019

3 commits

  • Unpacking an external initrd may fail, e.g. when there is not enough
    memory. This leaves an incomplete rootfs because some files might
    already have been extracted. Fix this by cleaning the rootfs so the
    kernel does not use an incomplete rootfs.

    Link: http://lkml.kernel.org/r/20181030151805.5519-1-david.engraf@sysgo.com
    Signed-off-by: David Engraf
    Cc: Dominik Brodowski
    Cc: Greg Kroah-Hartman
    Cc: Philippe Ombredanne
    Cc: Arnd Bergmann
    Cc: Luc Van Oostenryck
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Engraf
     
  • We get a warning when building kernel with W=1:

    kernel/fork.c:167:13: warning: no previous prototype for `arch_release_thread_stack' [-Wmissing-prototypes]
    kernel/fork.c:779:13: warning: no previous prototype for `fork_init' [-Wmissing-prototypes]

    Add the missing declaration in the header file to fix this.

    Also, remove arch_release_thread_stack() completely because no arch
    seems to implement it since bb9d81264 (arch: remove tile port).

    Link: http://lkml.kernel.org/r/1542170087-23645-1-git-send-email-wang.yi59@zte.com.cn
    Signed-off-by: Yi Wang
    Acked-by: Michal Hocko
    Acked-by: Mike Rapoport
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Yi Wang
     
  • Initcall names should not be changed.

    Link: http://lkml.kernel.org/r/20181124091829.GD10969@avx2
    Signed-off-by: Alexey Dobriyan
    Reviewed-by: Andrew Morton
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     

29 Dec, 2018

4 commits

  • Pull Devicetree updates from Rob Herring:
    "The biggest highlight here is the start of using json-schema for DT
    bindings. Being able to validate bindings has been discussed for years
    with little progress.

    - Initial support for DT bindings using json-schema language. This is
    the start of converting DT bindings from free-form text to a
    structured format.

    - Reworking of initrd address initialization. This moves to using the
    phys address instead of virt addr in the DT parsing code. This
    rework was motivated by CONFIG_BLK_DEV_INITRD causing unnecessary
    rebuilding of lots of files.

    - Fix stale phandle entries in phandle cache

    - DT overlay validation improvements. This exposed several memory
    leak bugs which have been fixed.

    - Use node name and device_type helper functions in DT code

    - Last remaining conversions to using %pOFn printk specifier instead
    of device_node.name directly

    - Create new common RTC binding doc and move all trivial RTC devices
    out of trivial-devices.txt.

    - New bindings for Freescale MAG3110 magnetometer, Cadence Sierra
    PHY, and Xen shared memory

    - Update dtc to upstream version v1.4.7-57-gf267e674d145"

    * tag 'devicetree-for-4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (68 commits)
    of: __of_detach_node() - remove node from phandle cache
    of: of_node_get()/of_node_put() nodes held in phandle cache
    gpio-omap.txt: add reg and interrupts properties
    dt-bindings: mrvl,intc: fix a trivial typo
    dt-bindings: iio: magnetometer: add dt-bindings for freescale mag3110
    dt-bindings: Convert trivial-devices.txt to json-schema
    dt-bindings: arm: mrvl: amend Browstone compatible string
    dt-bindings: arm: Convert Tegra board/soc bindings to json-schema
    dt-bindings: arm: Convert ZTE board/soc bindings to json-schema
    dt-bindings: arm: Add missing Xilinx boards
    dt-bindings: arm: Convert Xilinx board/soc bindings to json-schema
    dt-bindings: arm: Convert VIA board/soc bindings to json-schema
    dt-bindings: arm: Convert ST STi board/soc bindings to json-schema
    dt-bindings: arm: Convert SPEAr board/soc bindings to json-schema
    dt-bindings: arm: Convert CSR SiRF board/soc bindings to json-schema
    dt-bindings: arm: Convert QCom board/soc bindings to json-schema
    dt-bindings: arm: Convert TI nspire board/soc bindings to json-schema
    dt-bindings: arm: Convert TI davinci board/soc bindings to json-schema
    dt-bindings: arm: Convert Calxeda board/soc bindings to json-schema
    dt-bindings: arm: Convert Altera board/soc bindings to json-schema
    ...

    Linus Torvalds
     
  • Merge misc updates from Andrew Morton:

    - large KASAN update to use arm's "software tag-based mode"

    - a few misc things

    - sh updates

    - ocfs2 updates

    - just about all of MM

    * emailed patches from Andrew Morton: (167 commits)
    kernel/fork.c: mark 'stack_vm_area' with __maybe_unused
    memcg, oom: notify on oom killer invocation from the charge path
    mm, swap: fix swapoff with KSM pages
    include/linux/gfp.h: fix typo
    mm/hmm: fix memremap.h, move dev_page_fault_t callback to hmm
    hugetlbfs: Use i_mmap_rwsem to fix page fault/truncate race
    hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization
    memory_hotplug: add missing newlines to debugging output
    mm: remove __hugepage_set_anon_rmap()
    include/linux/vmstat.h: remove unused page state adjustment macro
    mm/page_alloc.c: allow error injection
    mm: migrate: drop unused argument of migrate_page_move_mapping()
    blkdev: avoid migration stalls for blkdev pages
    mm: migrate: provide buffer_migrate_page_norefs()
    mm: migrate: move migrate_page_lock_buffers()
    mm: migrate: lock buffers before migrate_page_move_mapping()
    mm: migration: factor out code to compute expected number of page references
    mm, page_alloc: enable pcpu_drain with zone capability
    kmemleak: add config to select auto scan
    mm/page_alloc.c: don't call kasan_free_pages() at deferred mem init
    ...

    Linus Torvalds
     
  • Pull block updates from Jens Axboe:
    "This is the main pull request for block/storage for 4.21.

    Larger than usual, it was a busy round with lots of goodies queued up.
    Most notable is the removal of the old IO stack, which has been a long
    time coming. No new features for a while, everything coming in this
    week has all been fixes for things that were previously merged.

    This contains:

    - Use atomic counters instead of semaphores for mtip32xx (Arnd)

    - Cleanup of the mtip32xx request setup (Christoph)

    - Fix for circular locking dependency in loop (Jan, Tetsuo)

    - bcache (Coly, Guoju, Shenghui)
    * Optimizations for writeback caching
    * Various fixes and improvements

    - nvme (Chaitanya, Christoph, Sagi, Jay, me, Keith)
    * host and target support for NVMe over TCP
    * Error log page support
    * Support for separate read/write/poll queues
    * Much improved polling
    * discard OOM fallback
    * Tracepoint improvements

    - lightnvm (Hans, Hua, Igor, Matias, Javier)
    * Igor added packed metadata to pblk. Now drives without metadata
    per LBA can be used as well.
    * Fix from Geert on uninitialized value on chunk metadata reads.
    * Fixes from Hans and Javier to pblk recovery and write path.
    * Fix from Hua Su to fix a race condition in the pblk recovery
    code.
    * Scan optimization added to pblk recovery from Zhoujie.
    * Small geometry cleanup from me.

    - Conversion of the last few drivers that used the legacy path to
    blk-mq (me)

    - Removal of legacy IO path in SCSI (me, Christoph)

    - Removal of legacy IO stack and schedulers (me)

    - Support for much better polling, now without interrupts at all.
    blk-mq adds support for multiple queue maps, which enables us to
    have a map per type. This in turn enables nvme to have separate
    completion queues for polling, which can then be interrupt-less.
    Also means we're ready for async polled IO, which is hopefully
    coming in the next release.

    - Killing of (now) unused block exports (Christoph)

    - Unification of the blk-rq-qos and blk-wbt wait handling (Josef)

    - Support for zoned testing with null_blk (Masato)

    - sx8 conversion to per-host tag sets (Christoph)

    - IO priority improvements (Damien)

    - mq-deadline zoned fix (Damien)

    - Ref count blkcg series (Dennis)

    - Lots of blk-mq improvements and speedups (me)

    - sbitmap scalability improvements (me)

    - Make core inflight IO accounting per-cpu (Mikulas)

    - Export timeout setting in sysfs (Weiping)

    - Cleanup the direct issue path (Jianchao)

    - Export blk-wbt internals in block debugfs for easier debugging
    (Ming)

    - Lots of other fixes and improvements"

    * tag 'for-4.21/block-20181221' of git://git.kernel.dk/linux-block: (364 commits)
    kyber: use sbitmap add_wait_queue/list_del wait helpers
    sbitmap: add helpers for add/del wait queue handling
    block: save irq state in blkg_lookup_create()
    dm: don't reuse bio for flushes
    nvme-pci: trace SQ status on completions
    nvme-rdma: implement polling queue map
    nvme-fabrics: allow user to pass in nr_poll_queues
    nvme-fabrics: allow nvmf_connect_io_queue to poll
    nvme-core: optionally poll sync commands
    block: make request_to_qc_t public
    nvme-tcp: fix spelling mistake "attepmpt" -> "attempt"
    nvme-tcp: fix endianess annotations
    nvmet-tcp: fix endianess annotations
    nvme-pci: refactor nvme_poll_irqdisable to make sparse happy
    nvme-pci: only set nr_maps to 2 if poll queues are supported
    nvmet: use a macro for default error location
    nvmet: fix comparison of a u16 with -1
    blk-mq: enable IO poll if .nr_queues of type poll > 0
    blk-mq: change blk_mq_queue_busy() to blk_mq_queue_inflight()
    blk-mq: skip zero-queue maps in blk_mq_map_swqueue
    ...

    Linus Torvalds
     
  • The current value of the early boot static pool size, 1024, is not big
    enough for systems with a large number of CPUs when timer and/or
    workqueue objects are selected. As a result, systems with 60+ CPUs and
    both timer and workqueue objects enabled could trigger "ODEBUG: Out of
    memory. ODEBUG disabled".

    Some debug objects are allocated during early boot. Enabling some
    options like timer or workqueue objects may increase the required size
    significantly on systems with a large number of CPUs. For example,

    CONFIG_DEBUG_OBJECTS_TIMERS:
    No. CPUs x 2 (worker pool) objects:
    start_kernel
    workqueue_init_early
    init_worker_pool
    init_timer_key
    debug_object_init

    plus No. CPUs objects (CONFIG_HIGH_RES_TIMERS):
    sched_init
    hrtick_rq_init
    hrtimer_init

    CONFIG_DEBUG_OBJECTS_WORK:
    No. CPUs objects:
    vmalloc_init
    __init_work

    plus No. CPUs x 6 (workqueue) objects:
    workqueue_init_early
    alloc_workqueue
    __alloc_workqueue_key
    alloc_and_link_pwqs
    init_pwq

    Also, plus No. CPUs objects:
    perf_event_init
    __init_srcu_struct
    init_srcu_struct_fields
    init_srcu_struct_nodes
    __init_work
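
    As a rough worked tally of the call chains listed above, the following
    C sketch adds up the per-CPU object counts. The function names are
    hypothetical, and the tally covers only the sources named in this
    message; other early-boot objects draw from the same static pool, so
    the real demand is higher.

```c
#include <stdbool.h>

/* From the text: size of the early boot static pool. */
#define ODEBUG_STATIC_POOL_SIZE 1024UL

/* CONFIG_DEBUG_OBJECTS_TIMERS: 2 per CPU (worker pools), plus 1 per CPU
 * for hrtick timers when CONFIG_HIGH_RES_TIMERS is enabled. */
unsigned long early_timer_objects(unsigned long ncpus, bool high_res_timers)
{
	unsigned long count = ncpus * 2;

	if (high_res_timers)
		count += ncpus;
	return count;
}

/* CONFIG_DEBUG_OBJECTS_WORK: 1 per CPU (vmalloc_init), 6 per CPU
 * (workqueues) and 1 more per CPU (SRCU via perf_event_init). */
unsigned long early_work_objects(unsigned long ncpus)
{
	return ncpus + ncpus * 6 + ncpus;
}

/* Sum of the sources listed in the message only. */
unsigned long listed_early_objects(unsigned long ncpus, bool high_res_timers)
{
	return early_timer_objects(ncpus, high_res_timers) +
	       early_work_objects(ncpus);
}
```

    For 64 CPUs with high-resolution timers this gives 192 timer objects
    and 512 work objects, 704 in total from these sources alone, against a
    static pool of 1024.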

    However, none of the things are actually used or required before
    debug_objects_mem_init() is invoked, so just move the call right before
    vmalloc_init().

    According to tglx, "the reason why the call is at this place in
    start_kernel() is historical. It's because back in the days when
    debugobjects were added the memory allocator was enabled way later than
    today."

    Link: http://lkml.kernel.org/r/20181126102407.1836-1-cai@gmx.us
    Signed-off-by: Qian Cai
    Suggested-by: Thomas Gleixner
    Cc: Waiman Long
    Cc: Yang Shi
    Cc: Arnd Bergmann
    Cc: Catalin Marinas
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Qian Cai
     

28 Dec, 2018

1 commit

  • Pull audit updates from Paul Moore:
    "In the finest of holiday traditions, I have a number of gifts to
    share today. While most of them are re-gifts from others, unlike the
    typical re-gift, these are things you will want in and around your
    tree; I promise.

    This pull request is perhaps a bit larger than our typical PR, but
    most of it comes from Jan's rework of audit's fanotify code; a very
    welcome improvement. We ran this through our normal regression tests,
    as well as some newly created stress tests and everything looks good.

    Richard added a few patches, mostly cleaning up a few things and
    shortening some of the audit records that we send to userspace; a
    change the userspace folks are quite happy about.

    Finally YueHaibing and I kick in a few patches to simplify things a
    bit and make the code less prone to errors.

    Lastly, I want to say thanks one more time to everyone who has
    contributed patches, testing, and code reviews for the audit subsystem
    over the past year. The project is what it is due to your help and
    contributions - thank you"

    * tag 'audit-pr-20181224' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit: (22 commits)
    audit: remove duplicated include from audit.c
    audit: shorten PATH cap values when zero
    audit: use current whenever possible
    audit: minimize our use of audit_log_format()
    audit: remove WATCH and TREE config options
    audit: use session_info helper
    audit: localize audit_log_session_info prototype
    audit: Use 'mark' name for fsnotify_mark variables
    audit: Replace chunk attached to mark instead of replacing mark
    audit: Simplify locking around untag_chunk()
    audit: Drop all unused chunk nodes during deletion
    audit: Guarantee forward progress of chunk untagging
    audit: Allocate fsnotify mark independently of chunk
    audit: Provide helper for dropping mark's chunk reference
    audit: Remove pointless check in insert_hash()
    audit: Factor out chunk replacement code
    audit: Make hash table insertion safe against concurrent lookups
    audit: Embed key into chunk
    audit: Fix possible tagging failures
    audit: Fix possible spurious -ENOSPC error
    ...

    Linus Torvalds
     

27 Dec, 2018

2 commits

  • Pull EFI updates from Ingo Molnar:
    "The main changes in this cycle were:

    - Allocate the E820 buffer before doing the
    GetMemoryMap/ExitBootServices dance so we don't run out of space

    - Clear EFI boot services mappings when freeing the memory

    - Harden efivars against callers that invoke it on non-EFI boots

    - Reduce the number of memblock reservations resulting from extensive
    use of the new efi_mem_reserve_persistent() API

    - Other assorted fixes and cleanups"

    * 'efi-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86/efi: Don't unmap EFI boot services code/data regions for EFI_OLD_MEMMAP and EFI_MIXED_MODE
    efi: Reduce the amount of memblock reservations for persistent allocations
    efi: Permit multiple entries in persistent memreserve data structure
    efi/libstub: Disable some warnings for x86{,_64}
    x86/efi: Move efi_<reserve/free>_boot_services() to arch/x86
    x86/efi: Unmap EFI boot services code/data regions from efi_pgd
    x86/mm/pageattr: Introduce helper function to unmap EFI boot services
    efi/fdt: Simplify the get_fdt() flow
    efi/fdt: Indentation fix
    firmware/efi: Add NULL pointer checks in efivars API functions

    Linus Torvalds
     
  • Pull RCU updates from Ingo Molnar:
    "The biggest RCU changes in this cycle were:

    - Convert RCU's BUG_ON() and similar calls to WARN_ON() and similar.

    - Replace calls of RCU-bh and RCU-sched update-side functions to
    their vanilla RCU counterparts. This series is a step towards
    complete removal of the RCU-bh and RCU-sched update-side functions.

    ( Note that some of these conversions are going upstream via their
    respective maintainers. )

    - Documentation updates, including a number of flavor-consolidation
    updates from Joel Fernandes.

    - Miscellaneous fixes.

    - Automate generation of the initrd filesystem used for rcutorture
    testing.

    - Convert spin_is_locked() assertions to instead use lockdep.

    ( Note that some of these conversions are going upstream via their
    respective maintainers. )

    - SRCU updates, especially including a fix from Dennis Krein for a
    bag-on-head-class bug.

    - RCU torture-test updates"

    * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (112 commits)
    rcutorture: Don't do busted forward-progress testing
    rcutorture: Use 100ms buckets for forward-progress callback histograms
    rcutorture: Recover from OOM during forward-progress tests
    rcutorture: Print forward-progress test age upon failure
    rcutorture: Print time since GP end upon forward-progress failure
    rcutorture: Print histogram of CB invocation at OOM time
    rcutorture: Print GP age upon forward-progress failure
    rcu: Print per-CPU callback counts for forward-progress failures
    rcu: Account for nocb-CPU callback counts in RCU CPU stall warnings
    rcutorture: Dump grace-period diagnostics upon forward-progress OOM
    rcutorture: Prepare for asynchronous access to rcu_fwd_startat
    torture: Remove unnecessary "ret" variables
    rcutorture: Affinity forward-progress test to avoid housekeeping CPUs
    rcutorture: Break up too-long rcu_torture_fwd_prog() function
    rcutorture: Remove cbflood facility
    torture: Bring any extra CPUs online during kernel startup
    rcutorture: Add call_rcu() flooding forward-progress tests
    rcutorture/formal: Replace synchronize_sched() with synchronize_rcu()
    tools/kernel.h: Replace synchronize_sched() with synchronize_rcu()
    net/decnet: Replace rcu_barrier_bh() with rcu_barrier()
    ...

    Linus Torvalds
     

21 Dec, 2018

1 commit


15 Dec, 2018

1 commit

  • The kernel commandline parameter named in CONFIG_PSI_DEFAULT_DISABLED
    help text contradicts the documentation in kernel-parameters.txt, and
    the code. Fix that.

    Link: http://lkml.kernel.org/r/20181203213416.GA12627@cmpxchg.org
    Fixes: e0c274472d ("psi: make disabling/enabling easier for vendor kernels")
    Signed-off-by: Baruch Siach
    Acked-by: Johannes Weiner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Baruch Siach
     

05 Dec, 2018

1 commit

  • Pull in v4.20-rc5, solving a conflict we'll otherwise get in aio.c and
    also getting the merge fix that went into mainline that users are
    hitting testing for-4.21/block and/or for-next.

    * tag 'v4.20-rc5': (664 commits)
    Linux 4.20-rc5
    PCI: Fix incorrect value returned from pcie_get_speed_cap()
    MAINTAINERS: Update linux-mips mailing list address
    ocfs2: fix potential use after free
    mm/khugepaged: fix the xas_create_range() error path
    mm/khugepaged: collapse_shmem() do not crash on Compound
    mm/khugepaged: collapse_shmem() without freezing new_page
    mm/khugepaged: minor reorderings in collapse_shmem()
    mm/khugepaged: collapse_shmem() remember to clear holes
    mm/khugepaged: fix crashes due to misaccounted holes
    mm/khugepaged: collapse_shmem() stop if punched or truncated
    mm/huge_memory: fix lockdep complaint on 32-bit i_size_read()
    mm/huge_memory: splitting set mapping+index before unfreeze
    mm/huge_memory: rename freeze_page() to unmap_page()
    initramfs: clean old path before creating a hardlink
    kernel/kcov.c: mark funcs in __sanitizer_cov_trace_pc() as notrace
    psi: make disabling/enabling easier for vendor kernels
    proc: fixup map_files test on arm
    debugobjects: avoid recursive calls with kmemleak
    userfaultfd: shmem: UFFDIO_COPY: set the page dirty if VM_WRITE is not set
    ...

    Jens Axboe
     

04 Dec, 2018

1 commit

  • …k/linux-rcu into core/rcu

    Pull RCU changes from Paul E. McKenney:

    - Convert RCU's BUG_ON() and similar calls to WARN_ON() and similar.

    - Replace calls of RCU-bh and RCU-sched update-side functions
    to their vanilla RCU counterparts. This series is a step
    towards complete removal of the RCU-bh and RCU-sched update-side
    functions.

    ( Note that some of these conversions are going upstream via their
    respective maintainers. )

    - Documentation updates, including a number of flavor-consolidation
    updates from Joel Fernandes.

    - Miscellaneous fixes.

    - Automate generation of the initrd filesystem used for
    rcutorture testing.

    - Convert spin_is_locked() assertions to instead use lockdep.

    ( Note that some of these conversions are going upstream via their
    respective maintainers. )

    - SRCU updates, especially including a fix from Dennis Krein
    for a bag-on-head-class bug.

    - RCU torture-test updates.

    Signed-off-by: Ingo Molnar <mingo@kernel.org>

    Ingo Molnar
     

01 Dec, 2018

2 commits

  • sys_link() can fail due to the new path already existing. This case
    often occurs when we use a concatenated initrd, for example:

    1) prepare a basic rootfs that contains a regular file rc.local
    lizhijian@:~/yocto-tiny-i386-2016-04-22$ cat etc/rc.local
    #!/bin/sh
    echo "Running /etc/rc.local..."
    yocto-tiny-i386-2016-04-22$ find . | sed 's,^\./,,' | cpio -o -H newc | gzip -n -9 >../rootfs.cgz

    2) create an extra initrd which also includes an etc/rc.local
    lizhijian@:~/lkp-x86_64/etc$ echo "append initrd" >rc.local
    lizhijian@:~/lkp/lkp-x86_64/etc$ cat rc.local
    append initrd
    lizhijian@:~/lkp/lkp-x86_64/etc$ ln rc.local rc.local.hardlink
    append initrd
    lizhijian@:~/lkp/lkp-x86_64/etc$ stat rc.local rc.local.hardlink
    File: 'rc.local'
    Size: 14 Blocks: 8 IO Block: 4096 regular file
    Device: 801h/2049d Inode: 11296086 Links: 2
    Access: (0664/-rw-rw-r--) Uid: ( 1002/lizhijian) Gid: ( 1002/lizhijian)
    Access: 2018-11-15 16:08:28.654464815 +0800
    Modify: 2018-11-15 16:07:57.514903210 +0800
    Change: 2018-11-15 16:08:24.180228872 +0800
    Birth: -
    File: 'rc.local.hardlink'
    Size: 14 Blocks: 8 IO Block: 4096 regular file
    Device: 801h/2049d Inode: 11296086 Links: 2
    Access: (0664/-rw-rw-r--) Uid: ( 1002/lizhijian) Gid: ( 1002/lizhijian)
    Access: 2018-11-15 16:08:28.654464815 +0800
    Modify: 2018-11-15 16:07:57.514903210 +0800
    Change: 2018-11-15 16:08:24.180228872 +0800
    Birth: -

    lizhijian@:~/lkp/lkp-x86_64$ find . | sed 's,^\./,,' | cpio -o -H newc | gzip -n -9 >../rc-local.cgz
    lizhijian@:~/lkp/lkp-x86_64$ gzip -dc ../rc-local.cgz | cpio -t
    .
    etc
    etc/rc.local.hardlink <<< it will be extracted first at this initrd
    etc/rc.local

    3) concatenate the 2 initrds and boot
    lizhijian@:~/lkp$ cat rootfs.cgz rc-local.cgz >concate-initrd.cgz
    lizhijian@:~/lkp$ qemu-system-x86_64 -nographic -enable-kvm -cpu host -smp 1 -m 1024 -kernel ~/lkp/linux/arch/x86/boot/bzImage -append "console=ttyS0 earlyprint=ttyS0 ignore_loglevel" -initrd ./concate-initrd.cgz -serial stdio -nodefaults

    In this case, sys_link(2) will fail and return -EEXIST, so we only get
    the rc.local from rootfs.cgz instead of the one from rc-local.cgz

    [akpm@linux-foundation.org: move code to avoid forward declaration]
    Link: http://lkml.kernel.org/r/1542352368-13299-1-git-send-email-lizhijian@cn.fujitsu.com
    Signed-off-by: Li Zhijian
    Cc: Philip Li
    Cc: Dominik Brodowski
    Cc: Li Zhijian
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Li Zhijian
     
  • Mel Gorman reports a hackbench regression with psi that would prohibit
    shipping the suse kernel with it default-enabled, but he'd still like
    users to be able to opt in at little to no cost to others.

    With the current combination of CONFIG_PSI and the psi_disabled bool set
    from the commandline, this is a challenge. Do the following things to
    make it easier:

    1. Add a config option CONFIG_PSI_DEFAULT_DISABLED that allows distros
    to enable CONFIG_PSI in their kernel but leave the feature disabled
    unless a user requests it at boot-time.

    To avoid double negatives, rename psi_disabled= to psi=.

    2. Make psi_disabled a static branch to eliminate any branch costs
    when the feature is disabled.
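
    The interaction between the build-time default and the boot parameter
    can be sketched in plain C as below. This is an illustration under
    stated assumptions, not the kernel code: the function name is
    hypothetical, default_disabled stands in for
    CONFIG_PSI_DEFAULT_DISABLED, and accepting only "0" and "1" as values
    of psi= is a simplification made for this sketch.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: decide whether psi ends up enabled.
 * default_disabled: stand-in for CONFIG_PSI_DEFAULT_DISABLED.
 * psi_arg: value given as psi=<value> on the command line, or NULL
 *          when the parameter is absent.
 */
bool psi_enabled_after_boot(bool default_disabled, const char *psi_arg)
{
	bool enabled = !default_disabled;

	if (psi_arg == NULL)
		return enabled;		/* no override: keep the default */
	if (strcmp(psi_arg, "1") == 0)
		return true;		/* user opted in at boot time */
	if (strcmp(psi_arg, "0") == 0)
		return false;		/* user opted out at boot time */
	return enabled;			/* unrecognized value: keep the default */
}
```

    A distro that builds with the default-disabled option keeps the feature
    off unless the user passes psi=1, which is the opt-in path the message
    describes.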

    In terms of numbers before and after this patch, Mel says:

    : The following is a comparison using CONFIG_PSI=n as a baseline against
    : your patch and a vanilla kernel
    :
    :                   4.20.0-rc4          4.20.0-rc4          4.20.0-rc4
    :          kconfigdisable-v1r1             vanilla     psidisable-v1r1
    : Amean  1    1.3100 (  0.00%)    1.3923 ( -6.28%)    1.3427 ( -2.49%)
    : Amean  3    3.8860 (  0.00%)    4.1230 * -6.10%*    3.8860 ( -0.00%)
    : Amean  5    6.8847 (  0.00%)    8.0390 *-16.77%*    6.7727 (  1.63%)
    : Amean  7    9.9310 (  0.00%)   10.8367 * -9.12%*    9.9910 ( -0.60%)
    : Amean 12   16.6577 (  0.00%)   18.2363 * -9.48%*   17.1083 ( -2.71%)
    : Amean 18   26.5133 (  0.00%)   27.8833 * -5.17%*   25.7663 (  2.82%)
    : Amean 24   34.3003 (  0.00%)   34.6830 ( -1.12%)   32.0450 (  6.58%)
    : Amean 30   40.0063 (  0.00%)   40.5800 ( -1.43%)   41.5087 ( -3.76%)
    : Amean 32   40.1407 (  0.00%)   41.2273 ( -2.71%)   39.9417 (  0.50%)
    :
    : It's showing that the vanilla kernel takes a hit (as the bisection
    : indicated it would) and that disabling PSI by default is reasonably
    : close in terms of performance for this particular workload on this
    : particular machine so;

    Link: http://lkml.kernel.org/r/20181127165329.GA29728@cmpxchg.org
    Signed-off-by: Johannes Weiner
    Tested-by: Mel Gorman
    Reported-by: Mel Gorman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Weiner
     

30 Nov, 2018

1 commit

  • efi_<reserve/free>_boot_services() are x86 specific quirks and as such
    should be in asm/efi.h, so move them from linux/efi.h. Also, call
    efi_free_boot_services() from __efi_enter_virtual_mode() as it is an
    x86 specific call and ideally shouldn't be part of init/main.c

    Signed-off-by: Sai Praneeth Prakhya
    Signed-off-by: Ard Biesheuvel
    Acked-by: Thomas Gleixner
    Cc: Andy Lutomirski
    Cc: Arend van Spriel
    Cc: Bhupesh Sharma
    Cc: Borislav Petkov
    Cc: Dave Hansen
    Cc: Eric Snowberg
    Cc: Hans de Goede
    Cc: Joe Perches
    Cc: Jon Hunter
    Cc: Julien Thierry
    Cc: Linus Torvalds
    Cc: Marc Zyngier
    Cc: Matt Fleming
    Cc: Nathan Chancellor
    Cc: Peter Zijlstra
    Cc: Sedat Dilek
    Cc: YiFei Zhu
    Cc: linux-efi@vger.kernel.org
    Link: http://lkml.kernel.org/r/20181129171230.18699-7-ard.biesheuvel@linaro.org
    Signed-off-by: Ingo Molnar

    Sai Praneeth Prakhya
     

28 Nov, 2018

1 commit


27 Nov, 2018

2 commits

  • ARC, ARM, ARM64 and Unicore32 are all capable of parsing the "initrd="
    command line parameter to allow specifying the physical address and size
    of an initrd. Move that parsing into init/do_mounts_initrd.c such that
    we no longer duplicate that logic.
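
    A simplified user-space sketch of such a parser follows. It assumes the
    parameter value has the form <start>,<size> and accepts only plain
    numbers (decimal, or hex with a 0x prefix); the struct and function
    names are hypothetical and are not the kernel's implementation.

```c
#include <stdlib.h>

/* Hypothetical container for the parsed values. */
struct initrd_region {
	unsigned long long start;	/* physical start address */
	unsigned long long size;	/* size in bytes */
};

/*
 * Parse the text after "initrd=", assumed to look like "<start>,<size>".
 * Returns 0 on success and -1 if the string is malformed.
 */
int parse_initrd_arg(const char *arg, struct initrd_region *out)
{
	char *end;
	const char *size_str;
	unsigned long long start, size;

	start = strtoull(arg, &end, 0);
	if (end == arg || *end != ',')
		return -1;		/* missing number or missing comma */

	size_str = end + 1;
	size = strtoull(size_str, &end, 0);
	if (end == size_str || *end != '\0')
		return -1;		/* missing size or trailing junk */

	out->start = start;
	out->size = size;
	return 0;
}
```

    Keeping one parser in shared code means each architecture only has to
    consume the two resulting values.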

    Signed-off-by: Florian Fainelli
    Reviewed-by: Mike Rapoport
    Signed-off-by: Rob Herring

    Florian Fainelli
     
  • Make phys_initrd_start and phys_initrd_size global variables declared in
    init/do_mounts_initrd.c such that we can later have generic code in
    drivers/of/fdt.c populate those variables for us.

    This requires both the ARM and unicore32 implementations to be properly
    guarded against CONFIG_BLK_DEV_INITRD, and also initialize the variables
    to the expected default values (unicore32).

    Signed-off-by: Florian Fainelli
    Reviewed-by: Mike Rapoport
    Signed-off-by: Rob Herring

    Florian Fainelli
     

20 Nov, 2018

1 commit


08 Nov, 2018

1 commit

  • This removes a bunch of core and elevator related code. On the core
    front, we remove anything related to queue running, draining,
    initialization, plugging, and congestions. We also kill anything
    related to request allocation, merging, retrieval, and completion.

    Remove any checking for single queue IO schedulers, as they no
    longer exist. This means we can also delete a bunch of code related
    to request issue, adding, completion, etc - and all the SQ related
    ops and helpers.

    Also kill the load_default_modules(), as all that did was provide
    for a way to load the default single queue elevator.

    Tested-by: Ming Lei
    Reviewed-by: Omar Sandoval
    Signed-off-by: Jens Axboe

    Jens Axboe
     

31 Oct, 2018

5 commits

  • When memblock allocation APIs are called with align = 0, the alignment
    is implicitly set to SMP_CACHE_BYTES.

    Implicit alignment is done deep in the memblock allocator and it can
    come as a surprise. Not that such an alignment would be wrong even
    when used incorrectly but it is better to be explicit for the sake of
    clarity and the principle of least surprise.

    Replace all such uses of memblock APIs with the 'align' parameter
    explicitly set to SMP_CACHE_BYTES and stop implicit alignment assignment
    in the memblock internal allocation functions.
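
    The difference between the two conventions can be sketched with a toy
    placement function. This is not the memblock code: the function names
    are hypothetical, and SMP_CACHE_BYTES is set to 64 here purely as an
    assumption for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SMP_CACHE_BYTES 64	/* assumption: value chosen for this sketch */

/* Round addr up to a multiple of align (align must be a power of two). */
static uintptr_t align_up(uintptr_t addr, size_t align)
{
	return (addr + align - 1) & ~((uintptr_t)align - 1);
}

/* Old convention: align == 0 is silently replaced deep in the allocator. */
uintptr_t place_implicit(uintptr_t next_free, size_t align)
{
	if (align == 0)
		align = SMP_CACHE_BYTES;
	return align_up(next_free, align);
}

/* New convention: the caller states the alignment it wants. */
uintptr_t place_explicit(uintptr_t next_free, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return align_up(next_free, align);
}
```

    Both calls place an allocation requested at offset 100 at offset 128,
    but only the explicit form shows the cache-line alignment at the call
    site.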

    For the case when memblock APIs are used via helper functions, e.g. like
    iommu_arena_new_node() in Alpha, the helper functions were detected with
    Coccinelle's help and then manually examined and updated where
    appropriate.

    The direct memblock APIs users were updated using the semantic patch below:

    @@
    expression size, min_addr, max_addr, nid;
    @@
    (
    |
    - memblock_alloc_try_nid_raw(size, 0, min_addr, max_addr, nid)
    + memblock_alloc_try_nid_raw(size, SMP_CACHE_BYTES, min_addr, max_addr,
    nid)
    |
    - memblock_alloc_try_nid_nopanic(size, 0, min_addr, max_addr, nid)
    + memblock_alloc_try_nid_nopanic(size, SMP_CACHE_BYTES, min_addr, max_addr,
    nid)
    |
    - memblock_alloc_try_nid(size, 0, min_addr, max_addr, nid)
    + memblock_alloc_try_nid(size, SMP_CACHE_BYTES, min_addr, max_addr, nid)
    |
    - memblock_alloc(size, 0)
    + memblock_alloc(size, SMP_CACHE_BYTES)
    |
    - memblock_alloc_raw(size, 0)
    + memblock_alloc_raw(size, SMP_CACHE_BYTES)
    |
    - memblock_alloc_from(size, 0, min_addr)
    + memblock_alloc_from(size, SMP_CACHE_BYTES, min_addr)
    |
    - memblock_alloc_nopanic(size, 0)
    + memblock_alloc_nopanic(size, SMP_CACHE_BYTES)
    |
    - memblock_alloc_low(size, 0)
    + memblock_alloc_low(size, SMP_CACHE_BYTES)
    |
    - memblock_alloc_low_nopanic(size, 0)
    + memblock_alloc_low_nopanic(size, SMP_CACHE_BYTES)
    |
    - memblock_alloc_from_nopanic(size, 0, min_addr)
    + memblock_alloc_from_nopanic(size, SMP_CACHE_BYTES, min_addr)
    |
    - memblock_alloc_node(size, 0, nid)
    + memblock_alloc_node(size, SMP_CACHE_BYTES, nid)
    )
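
    The difference between the implicit and the explicit convention can be
    sketched in user space as follows. This is only an illustration: the
    sketch_* names and the value 64 for the cache line size are assumptions
    made for this example, and the code is not taken from the memblock
    allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed value for this sketch; the real SMP_CACHE_BYTES is per-architecture. */
#define SKETCH_SMP_CACHE_BYTES 64u

/* Round addr up to the next multiple of align (align must be a power of two). */
uintptr_t sketch_align_up(uintptr_t addr, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~((uintptr_t)align - 1);
}

/* Old convention: align == 0 silently means cache line alignment. */
uintptr_t sketch_place_implicit(uintptr_t addr, size_t align)
{
    if (align == 0)
        align = SKETCH_SMP_CACHE_BYTES;
    return sketch_align_up(addr, align);
}

/* New convention: the caller always states the alignment it wants. */
uintptr_t sketch_place_explicit(uintptr_t addr, size_t align)
{
    return sketch_align_up(addr, align);
}
```

    With the explicit variant, a call site reads
    sketch_place_explicit(addr, SKETCH_SMP_CACHE_BYTES), so the alignment is
    visible where the allocation is requested.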

    [mhocko@suse.com: changelog update]
    [akpm@linux-foundation.org: coding-style fixes]
    [rppt@linux.ibm.com: fix missed uses of implicit alignment]
    Link: http://lkml.kernel.org/r/20181016133656.GA10925@rapoport-lnx
    Link: http://lkml.kernel.org/r/1538687224-17535-1-git-send-email-rppt@linux.vnet.ibm.com
    Signed-off-by: Mike Rapoport
    Suggested-by: Michal Hocko
    Acked-by: Paul Burton [MIPS]
    Acked-by: Michael Ellerman [powerpc]
    Acked-by: Michal Hocko
    Cc: Catalin Marinas
    Cc: Chris Zankel
    Cc: Geert Uytterhoeven
    Cc: Guan Xuetao
    Cc: Ingo Molnar
    Cc: Matt Turner
    Cc: Michal Simek
    Cc: Richard Weinberger
    Cc: Russell King
    Cc: Thomas Gleixner
    Cc: Tony Luck
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Rapoport
     
  • Move remaining definitions and declarations from include/linux/bootmem.h
    into include/linux/memblock.h and remove the redundant header.

    The includes were replaced with the semantic patch below, followed by
    semi-automated removal of duplicated '#include <linux/memblock.h>':

    @@
    @@
    - #include <linux/bootmem.h>
    + #include <linux/memblock.h>

    [sfr@canb.auug.org.au: dma-direct: fix up for the removal of linux/bootmem.h]
    Link: http://lkml.kernel.org/r/20181002185342.133d1680@canb.auug.org.au
    [sfr@canb.auug.org.au: powerpc: fix up for removal of linux/bootmem.h]
    Link: http://lkml.kernel.org/r/20181005161406.73ef8727@canb.auug.org.au
    [sfr@canb.auug.org.au: x86/kaslr, ACPI/NUMA: fix for linux/bootmem.h removal]
    Link: http://lkml.kernel.org/r/20181008190341.5e396491@canb.auug.org.au
    Link: http://lkml.kernel.org/r/1536927045-23536-30-git-send-email-rppt@linux.vnet.ibm.com
    Signed-off-by: Mike Rapoport
    Signed-off-by: Stephen Rothwell
    Acked-by: Michal Hocko
    Cc: Catalin Marinas
    Cc: Chris Zankel
    Cc: "David S. Miller"
    Cc: Geert Uytterhoeven
    Cc: Greentime Hu
    Cc: Greg Kroah-Hartman
    Cc: Guan Xuetao
    Cc: Ingo Molnar
    Cc: "James E.J. Bottomley"
    Cc: Jonas Bonn
    Cc: Jonathan Corbet
    Cc: Ley Foon Tan
    Cc: Mark Salter
    Cc: Martin Schwidefsky
    Cc: Matt Turner
    Cc: Michael Ellerman
    Cc: Michal Simek
    Cc: Palmer Dabbelt
    Cc: Paul Burton
    Cc: Richard Kuo
    Cc: Richard Weinberger
    Cc: Rich Felker
    Cc: Russell King
    Cc: Serge Semin
    Cc: Thomas Gleixner
    Cc: Tony Luck
    Cc: Vineet Gupta
    Cc: Yoshinori Sato
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Rapoport
     
  • alloc_bootmem(size) is a shortcut for allocating SMP_CACHE_BYTES-aligned
    memory. When the align parameter of memblock_alloc() is 0, the
    alignment is implicitly set to SMP_CACHE_BYTES, and thus alloc_bootmem(size)
    and memblock_alloc(size, 0) are equivalent.

    The conversion is done using the following semantic patch:

    @@
    expression size;
    @@
    - alloc_bootmem(size)
    + memblock_alloc(size, 0)

    Link: http://lkml.kernel.org/r/1536927045-23536-22-git-send-email-rppt@linux.vnet.ibm.com
    Signed-off-by: Mike Rapoport
    Cc: Catalin Marinas
    Cc: Chris Zankel
    Cc: "David S. Miller"
    Cc: Geert Uytterhoeven
    Cc: Greentime Hu
    Cc: Greg Kroah-Hartman
    Cc: Guan Xuetao
    Cc: Ingo Molnar
    Cc: "James E.J. Bottomley"
    Cc: Jonas Bonn
    Cc: Jonathan Corbet
    Cc: Ley Foon Tan
    Cc: Mark Salter
    Cc: Martin Schwidefsky
    Cc: Matt Turner
    Cc: Michael Ellerman
    Cc: Michal Hocko
    Cc: Michal Simek
    Cc: Palmer Dabbelt
    Cc: Paul Burton
    Cc: Richard Kuo
    Cc: Richard Weinberger
    Cc: Rich Felker
    Cc: Russell King
    Cc: Serge Semin
    Cc: Thomas Gleixner
    Cc: Tony Luck
    Cc: Vineet Gupta
    Cc: Yoshinori Sato
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Rapoport
     
  • The conversion is done using

    sed -i 's@memblock_virt_alloc@memblock_alloc@g' \
    $(git grep -l memblock_virt_alloc)

    Link: http://lkml.kernel.org/r/1536927045-23536-8-git-send-email-rppt@linux.vnet.ibm.com
    Signed-off-by: Mike Rapoport
    Cc: Catalin Marinas
    Cc: Chris Zankel
    Cc: "David S. Miller"
    Cc: Geert Uytterhoeven
    Cc: Greentime Hu
    Cc: Greg Kroah-Hartman
    Cc: Guan Xuetao
    Cc: Ingo Molnar
    Cc: "James E.J. Bottomley"
    Cc: Jonas Bonn
    Cc: Jonathan Corbet
    Cc: Ley Foon Tan
    Cc: Mark Salter
    Cc: Martin Schwidefsky
    Cc: Matt Turner
    Cc: Michael Ellerman
    Cc: Michal Hocko
    Cc: Michal Simek
    Cc: Palmer Dabbelt
    Cc: Paul Burton
    Cc: Richard Kuo
    Cc: Richard Weinberger
    Cc: Rich Felker
    Cc: Russell King
    Cc: Serge Semin
    Cc: Thomas Gleixner
    Cc: Tony Luck
    Cc: Vineet Gupta
    Cc: Yoshinori Sato
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Rapoport
     
  • Support referencing the root partition label from GPT as argument
    to the root= option on the kernel command line in analogy to
    referencing the partition uuid as root=PARTUUID=.

    Specifying the partition label instead of the uuid is often much
    easier, e.g. in embedded environments when there is an
    A/B rootfs partition scheme for interruptible firmware updates
    (i.e. rootfsA/ rootfsB).

    The partition label can be queried with the blkid command.
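
    As a rough illustration of the two reference styles, the sketch below
    classifies the value given to root= by its prefix. The PARTLABEL= prefix
    spelling, the enum and classify_root_arg() are assumptions made for this
    example; this is not the kernel's actual parsing code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum root_ref_kind {
    ROOT_REF_OTHER,      /* e.g. a device path */
    ROOT_REF_PARTUUID,   /* root=PARTUUID=... */
    ROOT_REF_PARTLABEL   /* root=PARTLABEL=... (assumed spelling) */
};

/*
 * Classify the value given to root= and store a pointer to the part
 * after the prefix in *value. For ROOT_REF_OTHER, *value is arg itself.
 */
enum root_ref_kind classify_root_arg(const char *arg, const char **value)
{
    static const char uuid_prefix[] = "PARTUUID=";
    static const char label_prefix[] = "PARTLABEL=";

    assert(arg != NULL && value != NULL);

    if (strncmp(arg, uuid_prefix, sizeof(uuid_prefix) - 1) == 0) {
        *value = arg + sizeof(uuid_prefix) - 1;
        return ROOT_REF_PARTUUID;
    }
    if (strncmp(arg, label_prefix, sizeof(label_prefix) - 1) == 0) {
        *value = arg + sizeof(label_prefix) - 1;
        return ROOT_REF_PARTLABEL;
    }
    *value = arg;
    return ROOT_REF_OTHER;
}
```

    In an A/B scheme, a boot loader could then switch between a label such
    as rootfsA and rootfsB without knowing the partition uuids.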

    Link: http://lkml.kernel.org/r/20180822060904.828E510665E@pc-niv.weinmann.com
    Signed-off-by: Nikolaus Voss
    Reviewed-by: Andrew Morton
    Cc: Dominik Brodowski
    Cc: Sasha Levin
    Cc: Al Viro
    Cc: Jens Axboe
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nikolaus Voss
     

27 Oct, 2018

2 commits

  • On a system that executes multiple cgrouped jobs and independent
    workloads, we don't just care about the health of the overall system, but
    also that of individual jobs, so that we can ensure individual job health,
    fairness between jobs, or prioritize some jobs over others.

    This patch implements pressure stall tracking for cgroups. In kernels
    with CONFIG_PSI=y, cgroup2 groups will have cpu.pressure, memory.pressure,
    and io.pressure files that track aggregate pressure stall times for only
    the tasks inside the cgroup.
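
    A monitoring tool could read one line of such a pressure file roughly as
    sketched below. The line layout assumed here ("some avg10=... avg60=...
    avg300=... total=...") is not spelled out in this changelog, and
    struct psi_line and parse_psi_line() are hypothetical names used only
    for this example.

```c
#include <assert.h>
#include <stdio.h>

/* One parsed line of a pressure file (assumed layout, see above). */
struct psi_line {
    char kind[8];              /* "some" or "full" */
    double avg10;              /* percentage over the last 10 seconds */
    double avg60;              /* percentage over the last minute */
    double avg300;             /* percentage over the last 5 minutes */
    unsigned long long total;  /* accumulated stall time */
};

/* Parse one line; returns 0 on success and -1 if the line does not match. */
int parse_psi_line(const char *line, struct psi_line *out)
{
    int fields;

    assert(line != NULL && out != NULL);
    fields = sscanf(line, "%7s avg10=%lf avg60=%lf avg300=%lf total=%llu",
                    out->kind, &out->avg10, &out->avg60, &out->avg300,
                    &out->total);
    return fields == 5 ? 0 : -1;
}
```

    Reading the per-cgroup files instead of the system-wide ones then only
    changes the path that is opened, not the parsing.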

    Link: http://lkml.kernel.org/r/20180828172258.3185-10-hannes@cmpxchg.org
    Signed-off-by: Johannes Weiner
    Acked-by: Tejun Heo
    Acked-by: Peter Zijlstra (Intel)
    Tested-by: Daniel Drake
    Tested-by: Suren Baghdasaryan
    Cc: Christopher Lameter
    Cc: Ingo Molnar
    Cc: Johannes Weiner
    Cc: Mike Galbraith
    Cc: Peter Enderborg
    Cc: Randy Dunlap
    Cc: Shakeel Butt
    Cc: Vinayak Menon
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Weiner
     
  • When systems are overcommitted and resources become contended, it's hard
    to tell exactly the impact this has on workload productivity, or how close
    the system is to lockups and OOM kills. In particular, when machines work
    multiple jobs concurrently, the impact of overcommit in terms of latency
    and throughput on the individual job can be enormous.

    In order to maximize hardware utilization without sacrificing individual
    job health or risking complete machine lockups, this patch implements a way
    to quantify resource pressure in the system.

    A kernel built with CONFIG_PSI=y creates files in /proc/pressure/ that
    expose the percentage of time the system is stalled on CPU, memory, or IO,
    respectively. Stall states are aggregate versions of the per-task delay
    accounting delays:

    cpu: some tasks are runnable but not executing on a CPU
    memory: tasks are reclaiming, or waiting for swapin or thrashing cache
    io: tasks are waiting for io completions

    These percentages of walltime can be thought of as pressure percentages,
    and they give a general sense of system health and productivity loss
    incurred by resource overcommit. They can also indicate when the system
    is approaching lockup scenarios and OOMs.

    To do this, psi keeps track of the task states associated with each CPU
    and samples the time they spend in stall states. Every 2 seconds, the
    samples are averaged across CPUs - weighted by the CPUs' non-idle time to
    eliminate artifacts from unused CPUs - and translated into percentages of
    walltime. A running average of those percentages is maintained over 10s,
    1m, and 5m periods (similar to the loadaverage).
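
    The aggregation described above can be sketched with floating point
    arithmetic as follows. This is a simplified illustration under stated
    assumptions: the weighting formula sum(stall * nonidle) / sum(nonidle),
    the struct and both function names are chosen for this example, and the
    decay factor of the running average is passed in by the caller rather
    than derived from the 10s, 1m and 5m windows.

```c
#include <assert.h>
#include <stddef.h>

/* Per-CPU sample for one aggregation period, in seconds. */
struct cpu_sample {
    double stall;    /* time spent in the stall state */
    double nonidle;  /* time the CPU was not idle */
};

/*
 * Average the stall times across CPUs, weighting each CPU by its
 * non-idle time so that unused CPUs do not dilute the result, and
 * express the outcome as a percentage of the period.
 */
double weighted_stall_pct(const struct cpu_sample *cpus, size_t n,
                          double period)
{
    double weighted = 0.0, nonidle_total = 0.0;
    size_t i;

    assert(cpus != NULL && period > 0.0);
    for (i = 0; i < n; i++) {
        weighted += cpus[i].stall * cpus[i].nonidle;
        nonidle_total += cpus[i].nonidle;
    }
    if (nonidle_total == 0.0)
        return 0.0;  /* every CPU was idle: no pressure */
    return 100.0 * (weighted / nonidle_total) / period;
}

/* One step of an exponentially decaying running average. */
double running_avg_update(double avg, double sample, double decay)
{
    return avg * decay + sample * (1.0 - decay);
}
```

    For one CPU that was busy and stalled for half of a 2 second period and
    a second CPU that was idle throughout, the weighted result is 50%,
    whereas a plain mean over both CPUs would report 25%.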

    [hannes@cmpxchg.org: doc fixlet, per Randy]
    Link: http://lkml.kernel.org/r/20180828205625.GA14030@cmpxchg.org
    [hannes@cmpxchg.org: code optimization]
    Link: http://lkml.kernel.org/r/20180907175015.GA8479@cmpxchg.org
    [hannes@cmpxchg.org: rename psi_clock() to psi_update_work(), per Peter]
    Link: http://lkml.kernel.org/r/20180907145404.GB11088@cmpxchg.org
    [hannes@cmpxchg.org: fix build]
    Link: http://lkml.kernel.org/r/20180913014222.GA2370@cmpxchg.org
    Link: http://lkml.kernel.org/r/20180828172258.3185-9-hannes@cmpxchg.org
    Signed-off-by: Johannes Weiner
    Acked-by: Peter Zijlstra (Intel)
    Tested-by: Daniel Drake
    Tested-by: Suren Baghdasaryan
    Cc: Christopher Lameter
    Cc: Ingo Molnar
    Cc: Johannes Weiner
    Cc: Mike Galbraith
    Cc: Peter Enderborg
    Cc: Randy Dunlap
    Cc: Shakeel Butt
    Cc: Tejun Heo
    Cc: Vinayak Menon
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Weiner
     

23 Oct, 2018

2 commits

  • Pull scheduler updates from Ingo Molnar:
    "The main changes are:

    - Migrate CPU-intense 'misfit' tasks on asymmetric capacity systems,
    to better utilize (much) faster 'big core' CPUs. (Morten Rasmussen,
    Valentin Schneider)

    - Topology handling improvements, in particular when CPU capacity
    changes and related load-balancing fixes/improvements (Morten
    Rasmussen)

    - ... plus misc other improvements, fixes and updates"

    * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (28 commits)
    sched/completions/Documentation: Add recommendation for dynamic and ONSTACK completions
    sched/completions/Documentation: Clean up the document some more
    sched/completions/Documentation: Fix a couple of punctuation nits
    cpu/SMT: State SMT is disabled even with nosmt and without "=force"
    sched/core: Fix comment regarding nr_iowait_cpu() and get_iowait_load()
    sched/fair: Remove setting task's se->runnable_weight during PELT update
    sched/fair: Disable LB_BIAS by default
    sched/pelt: Fix warning and clean up IRQ PELT config
    sched/topology: Make local variables static
    sched/debug: Use symbolic names for task state constants
    sched/numa: Remove unused numa_stats::nr_running field
    sched/numa: Remove unused code from update_numa_stats()
    sched/debug: Explicitly cast sched_feat() to bool
    sched/core: Disable SD_PREFER_SIBLING on asymmetric CPU capacity domains
    sched/fair: Don't move tasks to lower capacity CPUs unless necessary
    sched/fair: Set rq->rd->overload when misfit
    sched/fair: Wrap rq->rd->overload accesses with READ/WRITE_ONCE()
    sched/core: Change root_domain->overload type to int
    sched/fair: Change 'prefer_sibling' type to bool
    sched/fair: Kick nohz balance if rq->misfit_task_load
    ...

    Linus Torvalds
     
  • Pull locking and misc x86 updates from Ingo Molnar:
    "Lots of changes in this cycle - in part because locking/core attracted
    a number of related x86 low-level changes that were easier to handle
    in a single tree:

    - Linux Kernel Memory Consistency Model updates (Alan Stern, Paul E.
    McKenney, Andrea Parri)

    - lockdep scalability improvements and micro-optimizations (Waiman
    Long)

    - rwsem improvements (Waiman Long)

    - spinlock micro-optimization (Matthew Wilcox)

    - qspinlocks: Provide a liveness guarantee (more fairness) on x86.
    (Peter Zijlstra)

    - Add support for relative references in jump tables on arm64, x86
    and s390 to optimize jump labels (Ard Biesheuvel, Heiko Carstens)

    - Be a lot less permissive on weird (kernel address) uaccess faults
    on x86: BUG() when uaccess helpers fault on kernel addresses (Jann
    Horn)

    - macrofy x86 asm statements to un-confuse the GCC inliner. (Nadav
    Amit)

    - ... and a handful of other smaller changes as well"

    * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (57 commits)
    locking/lockdep: Make global debug_locks* variables read-mostly
    locking/lockdep: Fix debug_locks off performance problem
    locking/pvqspinlock: Extend node size when pvqspinlock is configured
    locking/qspinlock_stat: Count instances of nested lock slowpaths
    locking/qspinlock, x86: Provide liveness guarantee
    x86/asm: 'Simplify' GEN_*_RMWcc() macros
    locking/qspinlock: Rework some comments
    locking/qspinlock: Re-order code
    locking/lockdep: Remove duplicated 'lock_class_ops' percpu array
    x86/defconfig: Enable CONFIG_USB_XHCI_HCD=y
    futex: Replace spin_is_locked() with lockdep
    locking/lockdep: Make class->ops a percpu counter and move it under CONFIG_DEBUG_LOCKDEP=y
    x86/jump-labels: Macrofy inline assembly code to work around GCC inlining bugs
    x86/cpufeature: Macrofy inline assembly code to work around GCC inlining bugs
    x86/extable: Macrofy inline assembly code to work around GCC inlining bugs
    x86/paravirt: Work around GCC inlining bugs when compiling paravirt ops
    x86/bug: Macrofy the BUG table section handling, to work around GCC inlining bugs
    x86/alternatives: Macrofy lock prefixes to work around GCC inlining bugs
    x86/refcount: Work around GCC inlining bug
    x86/objtool: Use asm macros to work around GCC inlining bugs
    ...

    Linus Torvalds
     

09 Oct, 2018

1 commit

  • With CONFIG_VMAP_STACK=y the kernel stack of all tasks should be
    allocated in the vmalloc space. The initial stack used for all
    the early init code is in the init_thread_union. To be able to
    switch from this early stack to a properly allocated stack
    from vmalloc the architecture needs a switch-over point.

    Introduce the arch_call_rest_init() function with a weak definition
    in init/main.c with the only purpose to call rest_init() from the
    end of start_kernel(). The architecture override can then do the
    necessary magic to switch to the new vmalloc'ed stack.
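
    The weak-definition pattern can be sketched in user space as shown
    below, assuming a GCC or Clang toolchain that supports
    __attribute__((weak)). The counter and sketch_start_kernel() are
    stand-ins invented for this example; only the names rest_init() and
    arch_call_rest_init() come from the description above.

```c
#include <assert.h>

/* Counts how often the stand-in rest_init() has run. */
int rest_init_calls;

/* Stand-in for the remainder of the boot sequence. */
void rest_init(void)
{
    rest_init_calls++;
}

/*
 * Weak default: it only calls rest_init(). If another object file
 * provides a non-weak arch_call_rest_init(), the linker picks that one,
 * which gives an architecture a hook to switch stacks first.
 */
__attribute__((weak)) void arch_call_rest_init(void)
{
    rest_init();
}

/* Stand-in for the end of start_kernel(). */
void sketch_start_kernel(void)
{
    /* This sketch expects to be "booted" only once. */
    assert(rest_init_calls == 0);
    arch_call_rest_init();
}
```

    Without an override, calling sketch_start_kernel() simply ends up in
    rest_init() through the weak default.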

    Signed-off-by: Martin Schwidefsky

    Martin Schwidefsky
     

02 Oct, 2018

1 commit

  • Create a config for enabling irq load tracking in the scheduler.
    irq load tracking is useful only when irq or paravirtual time is
    accounted, but for now it is only possible with SMP.

    Also use __maybe_unused to remove the compilation warning in
    update_rq_clock_task() that has been introduced by:

    2e62c4743adc ("sched/fair: Remove #ifdefs from scale_rt_capacity()")

    Suggested-by: Ingo Molnar
    Reported-by: Dou Liyang
    Reported-by: Miguel Ojeda
    Signed-off-by: Vincent Guittot
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: bp@alien8.de
    Cc: dou_liyang@163.com
    Fixes: 2e62c4743adc ("sched/fair: Remove #ifdefs from scale_rt_capacity()")
    Link: http://lkml.kernel.org/r/1537867062-27285-1-git-send-email-vincent.guittot@linaro.org
    Signed-off-by: Ingo Molnar

    Vincent Guittot
     

27 Sep, 2018

1 commit

  • Jump table entries are mostly read-only, with the exception of the
    init and module loader code that defuses entries that point into init
    code when the code being referred to is freed.

    For robustness, it would be better to move these entries into the
    ro_after_init section, but clearing the 'code' member of each jump
    table entry referring to init code at module load time races with the
    module_enable_ro() call that remaps the ro_after_init section read
    only, so we'd like to do it earlier.

    So given that whether such an entry refers to init code can be decided
    much earlier, we can pull this check forward. Since we may still need
    the code entry at this point, let's switch to setting a low bit in the
    'key' member just like we do to annotate the default state of a jump
    table entry.
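
    Tagging a pointer-sized 'key' with low bits can be sketched as follows.
    The bit assignments (bit 0 for the default state, bit 1 for "refers to
    init code"), the struct layout and the sketch_* helpers are assumptions
    made for this example, and the sketch relies on the key address being at
    least 4-byte aligned so that both bits are free.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_ENTRY_BRANCH 1u  /* assumed: default state of the entry */
#define SKETCH_ENTRY_INIT   2u  /* assumed: entry refers to init code */
#define SKETCH_ENTRY_MASK   3u  /* all flag bits stored in 'key' */

struct sketch_jump_entry {
    uintptr_t code;    /* address of the patched instruction */
    uintptr_t target;  /* jump destination */
    uintptr_t key;     /* key address with flags in the low bits */
};

/* Mark the entry as pointing into init code without touching 'code'. */
void sketch_entry_set_init(struct sketch_jump_entry *e)
{
    assert(e != NULL);
    e->key |= SKETCH_ENTRY_INIT;
}

bool sketch_entry_is_init(const struct sketch_jump_entry *e)
{
    assert(e != NULL);
    return (e->key & SKETCH_ENTRY_INIT) != 0;
}

/* Recover the key address by masking off the flag bits. */
uintptr_t sketch_entry_key(const struct sketch_jump_entry *e)
{
    assert(e != NULL);
    return e->key & ~(uintptr_t)SKETCH_ENTRY_MASK;
}
```

    Because the flag lives in 'key', the 'code' member stays intact and can
    still be used after the entry has been marked.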

    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Thomas Gleixner
    Reviewed-by: Kees Cook
    Acked-by: Peter Zijlstra (Intel)
    Cc: linux-arm-kernel@lists.infradead.org
    Cc: linux-s390@vger.kernel.org
    Cc: Arnd Bergmann
    Cc: Heiko Carstens
    Cc: Will Deacon
    Cc: Catalin Marinas
    Cc: Steven Rostedt
    Cc: Martin Schwidefsky
    Cc: Jessica Yu
    Link: https://lkml.kernel.org/r/20180919065144.25010-8-ard.biesheuvel@linaro.org

    Ard Biesheuvel
     

26 Aug, 2018

1 commit

  • Pull more Kbuild updates from Masahiro Yamada:

    - add build_{menu,n,g,x}config targets for compile-testing Kconfig

    - fix and improve recursive dependency detection in Kconfig

    - fix parallel building of menuconfig/nconfig

    - fix syntax error in clang-version.sh

    - suppress distracting log from syncconfig

    - remove obsolete "rpm" target

    - remove VMLINUX_SYMBOL(_STR) macro entirely

    - fix microblaze build with CONFIG_DYNAMIC_FTRACE

    - move compiler test for dead code/data elimination to Kconfig

    - rename well-known LDFLAGS variable to KBUILD_LDFLAGS

    - misc fixes and cleanups

    * tag 'kbuild-v4.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
    kbuild: rename LDFLAGS to KBUILD_LDFLAGS
    kbuild: pass LDFLAGS to recordmcount.pl
    kbuild: test dead code/data elimination support in Kconfig
    initramfs: move gen_initramfs_list.sh from scripts/ to usr/
    vmlinux.lds.h: remove stale include
    export.h: remove VMLINUX_SYMBOL() and VMLINUX_SYMBOL_STR()
    Coccinelle: remove pci_alloc_consistent semantic to detect in zalloc-simple.cocci
    kbuild: make sorting initramfs contents independent of locale
    kbuild: remove "rpm" target, which is alias of "rpm-pkg"
    kbuild: Fix LOADLIBES rename in Documentation/kbuild/makefiles.txt
    kconfig: suppress "configuration written to .config" for syncconfig
    kconfig: fix "Can't open ..." in parallel build
    kbuild: Add a space after `!` to prevent parsing as file pattern
    scripts: modpost: check memory allocation results
    kconfig: improve the recursive dependency report
    kconfig: report recursive dependency involving 'imply'
    kconfig: error out when seeing recursive dependency
    kconfig: add build-only configurator targets
    scripts/dtc: consolidate include path options in Makefile

    Linus Torvalds
     

24 Aug, 2018

1 commit


23 Aug, 2018

2 commits

  • Merge more updates from Andrew Morton:

    - the rest of MM

    - procfs updates

    - various misc things

    - more y2038 fixes

    - get_maintainer updates

    - lib/ updates

    - checkpatch updates

    - various epoll updates

    - autofs updates

    - hfsplus

    - some reiserfs work

    - fatfs updates

    - signal.c cleanups

    - ipc/ updates

    * emailed patches from Andrew Morton: (166 commits)
    ipc/util.c: update return value of ipc_getref from int to bool
    ipc/util.c: further variable name cleanups
    ipc: simplify ipc initialization
    ipc: get rid of ids->tables_initialized hack
    lib/rhashtable: guarantee initial hashtable allocation
    lib/rhashtable: simplify bucket_table_alloc()
    ipc: drop ipc_lock()
    ipc/util.c: correct comment in ipc_obtain_object_check
    ipc: rename ipcctl_pre_down_nolock()
    ipc/util.c: use ipc_rcu_putref() for failues in ipc_addid()
    ipc: reorganize initialization of kern_ipc_perm.seq
    ipc: compute kern_ipc_perm.id under the ipc lock
    init/Kconfig: remove EXPERT from CHECKPOINT_RESTORE
    fs/sysv/inode.c: use ktime_get_real_seconds() for superblock stamp
    adfs: use timespec64 for time conversion
    kernel/sysctl.c: fix typos in comments
    drivers/rapidio/devices/rio_mport_cdev.c: remove redundant pointer md
    fork: don't copy inconsistent signal handler state to child
    signal: make get_signal() return bool
    signal: make sigkill_pending() return bool
    ...

    Linus Torvalds
     
  • The CHECKPOINT_RESTORE configuration option was introduced in 2012 and
    combined with EXPERT. CHECKPOINT_RESTORE is already enabled in many
    distribution kernels and also part of the defconfigs of various
    architectures.

    To make it easier for distributions to enable CHECKPOINT_RESTORE this
    removes EXPERT and moves the configuration option out of the EXPERT block.

    Link: http://lkml.kernel.org/r/20180712130733.11510-1-adrian@lisas.de
    Signed-off-by: Adrian Reber
    Acked-by: Oleg Nesterov
    Reviewed-by: Hendrik Brueckner
    Acked-by: Pavel Emelyanov
    Cc: Eric W. Biederman
    Cc: Andrei Vagin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Adrian Reber