20 Jul, 2019

1 commit

  • Pull vfs mount updates from Al Viro:
    "The first part of mount updates.

    Convert filesystems to use the new mount API"

    * 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits)
    mnt_init(): call shmem_init() unconditionally
    constify ksys_mount() string arguments
    don't bother with registering rootfs
    init_rootfs(): don't bother with init_ramfs_fs()
    vfs: Convert smackfs to use the new mount API
    vfs: Convert selinuxfs to use the new mount API
    vfs: Convert securityfs to use the new mount API
    vfs: Convert apparmorfs to use the new mount API
    vfs: Convert openpromfs to use the new mount API
    vfs: Convert xenfs to use the new mount API
    vfs: Convert gadgetfs to use the new mount API
    vfs: Convert oprofilefs to use the new mount API
    vfs: Convert ibmasmfs to use the new mount API
    vfs: Convert qib_fs/ipathfs to use the new mount API
    vfs: Convert efivarfs to use the new mount API
    vfs: Convert configfs to use the new mount API
    vfs: Convert binfmt_misc to use the new mount API
    convenience helper: get_tree_single()
    convenience helper get_tree_nodev()
    vfs: Kill sget_userns()
    ...

    Linus Torvalds
     

17 Jul, 2019

2 commits

  • Merge more updates from Andrew Morton:
    "VM:
    - z3fold fixes and enhancements by Henry Burns and Vitaly Wool

    - more accurate reclaimed slab caches calculations by Yafang Shao

    - fix MAP_UNINITIALIZED UAPI symbol to not depend on config, by
    Christoph Hellwig

    - !CONFIG_MMU fixes by Christoph Hellwig

    - new novmcoredd parameter to omit device dumps from vmcore, by
    Kairui Song

    - new test_meminit module for testing heap and pagealloc
    initialization, by Alexander Potapenko

    - ioremap improvements for huge mappings, by Anshuman Khandual

    - generalize kprobe page fault handling, by Anshuman Khandual

    - device-dax hotplug fixes and improvements, by Pavel Tatashin

    - enable synchronous DAX fault on powerpc, by Aneesh Kumar K.V

    - add pte_devmap() support for arm64, by Robin Murphy

    - unify locked_vm accounting with a helper, by Daniel Jordan

    - several misc fixes

    core/lib:
    - new typeof_member() macro including some users, by Alexey Dobriyan

    - make BIT() and GENMASK() available in asm, by Masahiro Yamada

    - changed LIST_POISON2 on x86_64 to 0xdead000000000122 for better
    code generation, by Alexey Dobriyan

    - rbtree code size optimizations, by Michel Lespinasse

    - convert struct pid count to refcount_t, by Joel Fernandes

    get_maintainer.pl:
    - add --no-moderated switch to skip moderated ML's, by Joe Perches

    misc:
    - ptrace PTRACE_GET_SYSCALL_INFO interface

    - coda updates

    - gdb scripts, various"

    [ Using merge message suggestion from Vlastimil Babka, with some editing - Linus ]

    * emailed patches from Andrew Morton: (100 commits)
    fs/select.c: use struct_size() in kmalloc()
    mm: add account_locked_vm utility function
    arm64: mm: implement pte_devmap support
    mm: introduce ARCH_HAS_PTE_DEVMAP
    mm: clean up is_device_*_page() definitions
    mm/mmap: move common defines to mman-common.h
    mm: move MAP_SYNC to asm-generic/mman-common.h
    device-dax: "Hotremove" persistent memory that is used like normal RAM
    mm/hotplug: make remove_memory() interface usable
    device-dax: fix memory and resource leak if hotplug fails
    include/linux/lz4.h: fix spelling and copy-paste errors in documentation
    ipc/mqueue.c: only perform resource calculation if user valid
    include/asm-generic/bug.h: fix "cut here" for WARN_ON for __WARN_TAINT architectures
    scripts/gdb: add helpers to find and list devices
    scripts/gdb: add lx-genpd-summary command
    drivers/pps/pps.c: clear offset flags in PPS_SETPARAMS ioctl
    kernel/pid.c: convert struct pid count to refcount_t
    drivers/rapidio/devices/rio_mport_cdev.c: NUL terminate some strings
    select: shift restore_saved_sigmask_unless() into poll_select_copy_remaining()
    select: change do_poll() to return -ERESTARTNOHAND rather than -EINTR
    ...

    Linus Torvalds
     
  • This fixes a couple typos I noticed in the slab Kconfig:

    sacrifies -> sacrifices
    accellerate -> accelerate

    Seeing as no other instances of these typos are found elsewhere in the
    kernel and that I originally added one of the two, I can only assume
    working on slab must have caused damage to the spelling centers of my
    brain.

    Link: http://lkml.kernel.org/r/201905292203.CD000546EB@keescook
    Signed-off-by: Kees Cook
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kees Cook
     

13 Jul, 2019

2 commits

  • Pull Kbuild updates from Masahiro Yamada:

    - remove headers_{install,check}_all targets

    - remove unreasonable 'depends on !UML' from CONFIG_SAMPLES

    - re-implement 'make headers_install' more cleanly

    - add new header-test-y syntax to compile-test headers

    - compile-test exported headers to ensure they are compilable in
    user-space

    - compile-test headers under include/ to ensure they are self-contained

    - remove -Waggregate-return, -Wno-uninitialized, -Wno-unused-value
    flags

    - add -Werror=unknown-warning-option for Clang

    - add 128-bit built-in types support to genksyms

    - fix missed rebuild of modules.builtin

    - propagate 'No space left on device' error in fixdep to Make

    - allow Clang to use its integrated assembler

    - improve some coccinelle scripts

    - add a new flag KBUILD_ABS_SRCTREE to request Kbuild to use absolute
    path for $(srctree).

    - do not ignore errors when compression utility is missing

    - misc cleanups

    * tag 'kbuild-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (49 commits)
    kbuild: use -- separator instead of $(filter-out ...) for cc-cross-prefix
    kbuild: Inform user to pass ARCH= for make mrproper
    kbuild: fix compression errors getting ignored
    kbuild: add a flag to force absolute path for srctree
    kbuild: replace KBUILD_SRCTREE with boolean building_out_of_srctree
    kbuild: remove src and obj from the top Makefile
    scripts/tags.sh: remove unused environment variables from comments
    scripts/tags.sh: drop SUBARCH support for ARM
    kbuild: compile-test kernel headers to ensure they are self-contained
    kheaders: include only headers into kheaders_data.tar.xz
    kheaders: remove meaningless -R option of 'ls'
    kbuild: support header-test-pattern-y
    kbuild: do not create wrappers for header-test-y
    kbuild: compile-test exported headers to ensure they are self-contained
    init/Kconfig: add CONFIG_CC_CAN_LINK
    kallsyms: exclude kasan local symbols on s390
    kbuild: add more hints about SUBDIRS replacement
    coccinelle: api/stream_open: treat all wait_.*() calls as blocking
    coccinelle: put_device: Add a cast to an expression for an assignment
    coccinelle: put_device: Adjust a message construction
    ...

    Linus Torvalds
     
  • Print the currently enabled stack and heap initialization modes.

    Stack initialization is enabled by a config flag, while heap
    initialization is configured at boot time with defaults being set in the
    config. It's more convenient for the user to have all information about
    these hardening measures in one place at boot, so the user can reason
    about the expected behavior of the running system.

    The possible options for stack are:
    - "all" for CONFIG_INIT_STACK_ALL;
    - "byref_all" for CONFIG_GCC_PLUGIN_STRUCTLEAK_BYREF_ALL;
    - "byref" for CONFIG_GCC_PLUGIN_STRUCTLEAK_BYREF;
    - "__user" for CONFIG_GCC_PLUGIN_STRUCTLEAK_USER;
    - "off" otherwise.

    Depending on the values of init_on_alloc and init_on_free boottime options
    we also report "heap alloc" and "heap free" as "on"/"off".

    In the init_on_free mode initializing pages at boot time may take a while,
    so print a notice about that as well. This depends on how much memory is
    installed, the memory bandwidth, etc. On a relatively modern x86 system,
    it takes about 0.75s/GB to wipe all memory:

    [ 0.418722] mem auto-init: stack:byref_all, heap alloc:off, heap free:on
    [ 0.419765] mem auto-init: clearing system memory may take some time...
    [ 12.376605] Memory: 16408564K/16776672K available (14339K kernel code, 1397K rwdata, 3756K rodata, 1636K init, 11460K bss, 368108K reserved, 0K cma-reserved)
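
    A minimal sketch of how such a boot-time report might look, modeled on
    the messages above. The helper names (want_init_on_alloc(),
    want_init_on_free()) and the exact strings are illustrative, not
    necessarily the patch's literal code:

    static void __init report_meminit(void)
    {
            const char *stack;

            if (IS_ENABLED(CONFIG_INIT_STACK_ALL))
                    stack = "all";
            else if (IS_ENABLED(CONFIG_GCC_PLUGIN_STRUCTLEAK_BYREF_ALL))
                    stack = "byref_all";
            else if (IS_ENABLED(CONFIG_GCC_PLUGIN_STRUCTLEAK_BYREF))
                    stack = "byref";
            else if (IS_ENABLED(CONFIG_GCC_PLUGIN_STRUCTLEAK_USER))
                    stack = "__user";
            else
                    stack = "off";

            pr_info("mem auto-init: stack:%s, heap alloc:%s, heap free:%s\n",
                    stack,
                    want_init_on_alloc(GFP_KERNEL) ? "on" : "off",
                    want_init_on_free() ? "on" : "off");
            if (want_init_on_free())
                    pr_info("mem auto-init: clearing system memory may take some time...\n");
    }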

    Link: http://lkml.kernel.org/r/20190617151050.92663-3-glider@google.com
    Signed-off-by: Alexander Potapenko
    Suggested-by: Kees Cook
    Acked-by: Kees Cook
    Cc: Christoph Lameter
    Cc: Dmitry Vyukov
    Cc: James Morris
    Cc: Jann Horn
    Cc: Kostya Serebryany
    Cc: Laura Abbott
    Cc: Mark Rutland
    Cc: Masahiro Yamada
    Cc: Matthew Wilcox
    Cc: Nick Desaulniers
    Cc: Randy Dunlap
    Cc: Sandeep Patil
    Cc: "Serge E. Hallyn"
    Cc: Souptick Joarder
    Cc: Marco Elver
    Cc: Kaiwan N Billimoria
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexander Potapenko
     

10 Jul, 2019

2 commits

  • Pull Documentation updates from Jonathan Corbet:
    "It's been a relatively busy cycle for docs:

    - A fair pile of RST conversions, many from Mauro. These create more
    than the usual number of simple but annoying merge conflicts with
    other trees, unfortunately. He has a lot more of these waiting in
    the wings that, I think, will go to you directly later on.

    - A new document on how to use merges and rebases in kernel repos,
    and one on Spectre vulnerabilities.

    - Various improvements to the build system, including automatic
    markup of function() references because some people, for reasons I
    will never understand, were of the opinion that
    :c:func:``function()`` is unattractive and not fun to type.

    - We now recommend using sphinx 1.7, but still support back to 1.4.

    - Lots of smaller improvements, warning fixes, typo fixes, etc"

    * tag 'docs-5.3' of git://git.lwn.net/linux: (129 commits)
    docs: automarkup.py: ignore exceptions when seeking for xrefs
    docs: Move binderfs to admin-guide
    Disable Sphinx SmartyPants in HTML output
    doc: RCU callback locks need only _bh, not necessarily _irq
    docs: format kernel-parameters -- as code
    Doc : doc-guide : Fix a typo
    platform: x86: get rid of a non-existent document
    Add the RCU docs to the core-api manual
    Documentation: RCU: Add TOC tree hooks
    Documentation: RCU: Rename txt files to rst
    Documentation: RCU: Convert RCU UP systems to reST
    Documentation: RCU: Convert RCU linked list to reST
    Documentation: RCU: Convert RCU basic concepts to reST
    docs: filesystems: Remove unneeded .rst extension on toctables
    scripts/sphinx-pre-install: fix out-of-tree build
    docs: zh_CN: submitting-drivers.rst: Remove a duplicated Documentation/
    Documentation: PGP: update for newer HW devices
    Documentation: Add section about CPU vulnerabilities for Spectre
    Documentation: platform: Delete x86-laptop-drivers.txt
    docs: Note that :c:func: should no longer be used
    ...

    Linus Torvalds
     
  • Pull block updates from Jens Axboe:
    "This is the main block updates for 5.3. Nothing earth shattering or
    major in here, just fixes, additions, and improvements all over the
    map. This contains:

    - Series of documentation fixes (Bart)

    - Optimization of the blk-mq ctx get/put (Bart)

    - null_blk removal race condition fix (Bob)

    - req/bio_op() cleanups (Chaitanya)

    - Series cleaning up the segment accounting, and request/bio mapping
    (Christoph)

    - Series cleaning up the page getting/putting for bios (Christoph)

    - block cgroup cleanups and moving it to where it is used (Christoph)

    - block cgroup fixes (Tejun)

    - Series of fixes and improvements to bcache, most notably a write
    deadlock fix (Coly)

    - blk-iolatency STS_AGAIN and accounting fixes (Dennis)

    - Series of improvements and fixes to BFQ (Douglas, Paolo)

    - debugfs_create() return value check removal for drbd (Greg)

    - Use struct_size(), where appropriate (Gustavo)

    - Two lightnvm fixes (Heiner, Geert)

    - MD fixes, including a read balance and corruption fix (Guoqing,
    Marcos, Xiao, Yufen)

    - block opal shadow mbr additions (Jonas, Revanth)

    - sbitmap compare-and-exchange improvements (Pavel)

    - Fix for potential bio->bi_size overflow (Ming)

    - NVMe pull requests:
    - improved PCIe suspend support (Keith Busch)
    - error injection support for the admin queue (Akinobu Mita)
    - Fibre Channel discovery improvements (James Smart)
    - tracing improvements including nvmet tracing support (Minwoo Im)
    - misc fixes and cleanups (Anton Eidelman, Minwoo Im, Chaitanya
    Kulkarni)

    - Various little fixes and improvements to drivers and core"

    * tag 'for-5.3/block-20190708' of git://git.kernel.dk/linux-block: (153 commits)
    blk-iolatency: fix STS_AGAIN handling
    block: nr_phys_segments needs to be zero for REQ_OP_WRITE_ZEROES
    blk-mq: simplify blk_mq_make_request()
    blk-mq: remove blk_mq_put_ctx()
    sbitmap: Replace cmpxchg with xchg
    block: fix .bi_size overflow
    block: sed-opal: check size of shadow mbr
    block: sed-opal: ioctl for writing to shadow mbr
    block: sed-opal: add ioctl for done-mark of shadow mbr
    block: never take page references for ITER_BVEC
    direct-io: use bio_release_pages in dio_bio_complete
    block_dev: use bio_release_pages in bio_unmap_user
    block_dev: use bio_release_pages in blkdev_bio_end_io
    iomap: use bio_release_pages in iomap_dio_bio_end_io
    block: use bio_release_pages in bio_map_user_iov
    block: use bio_release_pages in bio_unmap_user
    block: optionally mark pages dirty in bio_release_pages
    block: move the BIO_NO_PAGE_REF check into bio_release_pages
    block: skd_main.c: Remove call to memset after dma_alloc_coherent
    block: mtip32xx: Remove call to memset after dma_alloc_coherent
    ...

    Linus Torvalds
     

09 Jul, 2019

4 commits

  • The headers in include/ are globally used in the kernel source tree
    to provide common APIs. They are included from external modules, too.

    It will be useful to make as many headers self-contained as possible
    so that we do not have to rely on a specific include order.

    There are more than 4000 headers in include/. In my rough analysis,
    70% of them are already self-contained. With effort, most of them
    can be self-contained.

    For now, we must exclude more than 1000 headers just because they
    cannot be compiled as standalone units. I added them to header-test-.
    The blacklist was mostly generated by a script, so the reasons for the
    breakage should be checked later.

    Signed-off-by: Masahiro Yamada
    Tested-by: Jani Nikula
    Reviewed-by: Joel Fernandes (Google)

    Masahiro Yamada
     
  • Pull cgroup updates from Tejun Heo:
    "Documentation updates and the addition of cgroup_parse_float() which
    will be used by new controllers including blk-iocost"
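
    For reference, a hedged sketch of how a controller might call the new
    helper; the signature matches the merged cgroup_parse_float(), but the
    surrounding write handler is invented for illustration:

    /* "1.25" parsed with dec_shift == 2 yields v == 125 */
    static ssize_t example_weight_write(struct kernfs_open_file *of,
                                        char *buf, size_t nbytes, loff_t off)
    {
            s64 v;
            int ret;

            ret = cgroup_parse_float(buf, 2, &v);
            if (ret)
                    return ret;
            /* apply v as a fixed-point weight ... */
            return nbytes;
    }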

    * 'for-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
    docs: cgroup-v1: convert docs to ReST and rename to *.rst
    cgroup: Move cgroup_parse_float() implementation out of CONFIG_SYSFS
    cgroup: add cgroup_parse_float()

    Linus Torvalds
     
  • Pull scheduler updates from Ingo Molnar:

    - Remove the unused per rq load array and all its infrastructure, by
    Dietmar Eggemann.

    - Add utilization clamping support by Patrick Bellasi. This is a
    refinement of the energy aware scheduling framework with support for
    boosting of interactive and capping of background workloads: to make
    sure critical GUI threads get maximum frequency ASAP, and to make
    sure background processing doesn't unnecessarily move the cpufreq
    governor to higher frequencies and less energy-efficient CPU modes.

    - Add the bare minimum of tracepoints required for LISA EAS regression
    testing, by Qais Yousef - which allows automated testing of various
    power management features, including energy aware scheduling.

    - Restructure the former tsk_nr_cpus_allowed() facility that the -rt
    kernel used to modify the scheduler's CPU affinity logic such as
    migrate_disable() - introduce the task->cpus_ptr value instead of
    taking the address of &task->cpus_allowed directly - by Sebastian
    Andrzej Siewior.

    - Misc optimizations, fixes, cleanups and small enhancements - see the
    Git log for details.

    * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (33 commits)
    sched/uclamp: Add uclamp support to energy_compute()
    sched/uclamp: Add uclamp_util_with()
    sched/cpufreq, sched/uclamp: Add clamps for FAIR and RT tasks
    sched/uclamp: Set default clamps for RT tasks
    sched/uclamp: Reset uclamp values on RESET_ON_FORK
    sched/uclamp: Extend sched_setattr() to support utilization clamping
    sched/core: Allow sched_setattr() to use the current policy
    sched/uclamp: Add system default clamps
    sched/uclamp: Enforce last task's UCLAMP_MAX
    sched/uclamp: Add bucket local max tracking
    sched/uclamp: Add CPU's clamp buckets refcounting
    sched/fair: Rename weighted_cpuload() to cpu_runnable_load()
    sched/debug: Export the newly added tracepoints
    sched/debug: Add sched_overutilized tracepoint
    sched/debug: Add new tracepoint to track PELT at se level
    sched/debug: Add new tracepoints to track PELT at rq level
    sched/debug: Add new sched_trace_*() helper functions
    sched/autogroup: Make autogroup_path() always available
    sched/wait: Deduplicate code with do-while
    sched/topology: Remove unused 'sd' parameter from arch_scale_cpu_capacity()
    ...

    Linus Torvalds
     
  • Pull locking updates from Ingo Molnar:
    "The main changes in this cycle are:

    - rwsem scalability improvements, phase #2, by Waiman Long, which are
    rather impressive:

    "On a 2-socket 40-core 80-thread Skylake system with 40 reader
    and writer locking threads, the min/mean/max locking operations
    done in a 5-second testing window before the patchset were:

    40 readers, Iterations Min/Mean/Max = 1,807/1,808/1,810
    40 writers, Iterations Min/Mean/Max = 1,807/50,344/151,255

    After the patchset, they became:

    40 readers, Iterations Min/Mean/Max = 30,057/31,359/32,741
    40 writers, Iterations Min/Mean/Max = 94,466/95,845/97,098"

    There are a lot of changes to the locking implementation that make
    it similar to qrwlock, including owner handoff for fairer
    locking.

    Another microbenchmark shows how the improvements hold up across
    the spectrum:

    "With a locking microbenchmark running on 5.1 based kernel, the
    total locking rates (in kops/s) on a 2-socket Skylake system
    with equal numbers of readers and writers (mixed) before and
    after this patchset were:

    # of Threads   Before Patch   After Patch
    ------------   ------------   -----------
               2          2,618         4,193
               4          1,202         3,726
               8            802         3,622
              16            729         3,359
              32            319         2,826
              64            102         2,744"

    The changes are extensive and the patch-set has been through
    several iterations addressing various locking workloads. There
    might be more regressions, but unless they are pathological I
    believe we want to use this new implementation as the baseline
    going forward.

    - jump-label optimizations by Daniel Bristot de Oliveira: the primary
    motivation was to remove IPI disturbance of isolated RT-workload
    CPUs, which resulted in the implementation of batched jump-label
    updates. Beyond the improvement of the kernel's real-time
    characteristics, in one test this patchset improved static key update
    overhead from 57 msecs to just 1.4 msecs - which is a nice speedup
    as well.

    - atomic64_t cross-arch type cleanups by Mark Rutland: over the last
    ~10 years of atomic64_t existence the various types used by the
    APIs only had to be self-consistent within each architecture -
    which means they became wildly inconsistent across architectures.
    Mark puts an end to this by reworking all the atomic64
    implementations to use 's64' as the base type for atomic64_t, and
    to ensure that this type is consistently used for parameters and
    return values in the API, avoiding further problems in this area.

    - A large set of small improvements to lockdep by Yuyang Du: type
    cleanups, output cleanups, function return type and other cleanups
    all around the place.

    - A set of percpu ops cleanups and fixes by Peter Zijlstra.

    - Misc other changes - please see the Git log for more details"

    * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (82 commits)
    locking/lockdep: increase size of counters for lockdep statistics
    locking/atomics: Use sed(1) instead of non-standard head(1) option
    locking/lockdep: Move mark_lock() inside CONFIG_TRACE_IRQFLAGS && CONFIG_PROVE_LOCKING
    x86/jump_label: Make tp_vec_nr static
    x86/percpu: Optimize raw_cpu_xchg()
    x86/percpu, sched/fair: Avoid local_clock()
    x86/percpu, x86/irq: Relax {set,get}_irq_regs()
    x86/percpu: Relax smp_processor_id()
    x86/percpu: Differentiate this_cpu_{}() and __this_cpu_{}()
    locking/rwsem: Guard against making count negative
    locking/rwsem: Adaptive disabling of reader optimistic spinning
    locking/rwsem: Enable time-based spinning on reader-owned rwsem
    locking/rwsem: Make rwsem->owner an atomic_long_t
    locking/rwsem: Enable readers spinning on writer
    locking/rwsem: Clarify usage of owner's nonspinnable bit
    locking/rwsem: Wake up almost all readers in wait queue
    locking/rwsem: More optimal RT task handling of null owner
    locking/rwsem: Always release wait_lock before waking up tasks
    locking/rwsem: Implement lock handoff to prevent lock starvation
    locking/rwsem: Make rwsem_spin_on_owner() return owner state
    ...

    Linus Torvalds
     

08 Jul, 2019

2 commits

  • Multiple people have suggested compile-testing UAPI headers to ensure
    they can really be included from user-space. "make headers_check" is
    obviously not enough to catch bugs, and we often leak unresolved
    references to user-space.

    Use the new header-test-y syntax to implement it. Please note exported
    headers are compile-tested with a completely different set of compiler
    flags. The header search path is set to $(objtree)/usr/include since
    exported headers should not include unexported ones.

    We use -std=gnu89 for the kernel space since the kernel code highly
    depends on GNU extensions. On the other hand, UAPI headers should be
    written in more standardized C, so they are compiled with -std=c90.
    This will emit errors if C++ style comments, the keyword 'inline', etc.
    are used. Please use C style comments (/* ... */), '__inline__', etc.
    in UAPI headers.
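
    As a concrete illustration, here is a hypothetical UAPI header written
    to survive the -std=c90 pass (the header name and contents are
    invented):

    /* include/uapi/linux/example.h -- hypothetical */
    #ifndef _UAPI_LINUX_EXAMPLE_H
    #define _UAPI_LINUX_EXAMPLE_H

    #include <linux/types.h>

    struct example_req {
            __u32 flags;    /* C-style comments only; // breaks -std=c90 */
            __u64 addr;
    };

    /* 'inline' is not a C90 keyword; use __inline__ instead */
    static __inline__ int example_req_valid(const struct example_req *r)
    {
            return r->flags != 0;
    }

    #endif /* _UAPI_LINUX_EXAMPLE_H */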

    There is an additional compiler requirement to enable this test because
    many UAPI headers include <stdlib.h>, <stdio.h>, <sys/ioctl.h>,
    etc. directly or indirectly. You cannot use kernel.org pre-built
    toolchains [1] since they lack <stdlib.h>.

    I reused CONFIG_CC_CAN_LINK to check the system header availability.
    The intention is slightly different, but a compiler that can link
    userspace programs provides system headers.

    For now, a lot of headers need to be excluded because they cannot
    be compiled standalone, but this is a good starting point.

    [1] https://mirrors.edge.kernel.org/pub/tools/crosstool/index.html

    Signed-off-by: Masahiro Yamada
    Reviewed-by: Sam Ravnborg

    Masahiro Yamada
     
  • Currently, scripts/cc-can-link.sh is run just for BPFILTER_UMH, but
    defining CC_CAN_LINK will be useful in other places.

    Signed-off-by: Masahiro Yamada

    Masahiro Yamada
     

05 Jul, 2019

3 commits

  • No point having two call sites (earlier in init_rootfs() from
    mnt_init() in case we are going to use shmem-style rootfs,
    later from do_basic_setup() unconditionally), along with the
    logic in shmem_init() itself to make the second call a no-op...

    Signed-off-by: Al Viro

    Al Viro
     
  • init_mount_tree() can get to rootfs_fs_type directly and that simplifies
    a lot of things. We don't need to register it, we don't need to look
    it up *and* we don't need to bother with preventing subsequent userland
    mounts. That's the way we should've done that from the very beginning.

    There is a user-visible change, namely the disappearance of "rootfs"
    from /proc/filesystems. Note that it's been unmountable all along
    and it didn't show up in /proc/mounts; however, it *is* a user-visible
    change and theoretically some script might've been using its presence
    in /proc/filesystems to tell 2.4.11+ from earlier kernels.

    *IF* any complaints about behaviour change do show up, we could fake
    it in /proc/filesystems. I very much doubt we'll have to, though.

    Signed-off-by: Al Viro

    Al Viro
     
  • the only thing done by the latter is making ramfs visible
    to mount(2); we don't need it there - rootfs is separate
    and, in fact, made visible to mount(2) in the same init_rootfs().

    Signed-off-by: Al Viro

    Al Viro
     

29 Jun, 2019

1 commit

  • With gcc-4.6.3:

    WARNING: vmlinux.o(.text.unlikely+0x140): Section mismatch in reference from the function populate_initrd_image() to the variable .init.ramfs.info:__initramfs_size
    The function populate_initrd_image() references
    the variable __init __initramfs_size.
    This is often because populate_initrd_image lacks a __init
    annotation or the annotation of __initramfs_size is wrong.

    WARNING: vmlinux.o(.text.unlikely+0x14c): Section mismatch in reference from the function populate_initrd_image() to the function .init.text:unpack_to_rootfs()
    The function populate_initrd_image() references
    the function __init unpack_to_rootfs().
    This is often because populate_initrd_image lacks a __init
    annotation or the annotation of unpack_to_rootfs is wrong.

    WARNING: vmlinux.o(.text.unlikely+0x198): Section mismatch in reference from the function populate_initrd_image() to the function .init.text:xwrite()
    The function populate_initrd_image() references
    the function __init xwrite().
    This is often because populate_initrd_image lacks a __init
    annotation or the annotation of xwrite is wrong.

    Indeed, if the compiler decides not to inline populate_initrd_image(), a
    warning is generated.

    Fix this by adding the missing __init annotations.
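
    The shape of the fix, sketched below; the real function body in
    init/initramfs.c is elided here, and the signature may differ slightly:

    /* placed in .init.text alongside the __init symbols it references:
     * __initramfs_size, unpack_to_rootfs(), xwrite()
     */
    static void __init populate_initrd_image(char *err)
    {
            /* body unchanged by the fix; only the annotation is added */
    }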

    Link: http://lkml.kernel.org/r/20190617074340.12779-1-geert@linux-m68k.org
    Fixes: 7c184ecd262fe64f ("initramfs: factor out a helper to populate the initrd image")
    Signed-off-by: Geert Uytterhoeven
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Geert Uytterhoeven
     

25 Jun, 2019

1 commit

  • Utilization clamping allows clamping the CPU's utilization within a
    [util_min, util_max] range, depending on the set of RUNNABLE tasks on
    that CPU. Each task references two "clamp buckets" defining its minimum
    and maximum (util_{min,max}) utilization "clamp values". A CPU's clamp
    bucket is active if there is at least one RUNNABLE task enqueued on
    that CPU and refcounting that bucket.

    When a task is {en,de}queued {on,from} a rq, the set of active clamp
    buckets on that CPU can change. If the set of active clamp buckets
    changes for a CPU a new "aggregated" clamp value is computed for that
    CPU. This is because each clamp bucket enforces a different utilization
    clamp value.

    Clamp values are always MAX aggregated for both util_min and util_max.
    This ensures that no task can affect the performance of other
    co-scheduled tasks which are more boosted (i.e. with higher util_min
    clamp) or less capped (i.e. with higher util_max clamp).

    A task has:
    task_struct::uclamp[clamp_id]::bucket_id
    to track the "bucket index" of the CPU's clamp bucket it refcounts while
    enqueued, for each clamp index (clamp_id).

    A runqueue has:
    rq::uclamp[clamp_id]::bucket[bucket_id].tasks
    to track how many RUNNABLE tasks on that CPU refcount each
    clamp bucket (bucket_id) of a clamp index (clamp_id).
    It also has a:
    rq::uclamp[clamp_id]::bucket[bucket_id].value
    to track the clamp value of each clamp bucket (bucket_id) of a clamp
    index (clamp_id).

    The rq::uclamp::bucket[clamp_id][] array is scanned every time it's
    needed to find a new MAX aggregated clamp value for a clamp_id. This
    operation is required only when the last task of the clamp bucket
    tracking the current MAX aggregated clamp value is dequeued. In this
    case, the CPU is either entering IDLE or going to schedule a less
    boosted or more clamped task.

    The expected number of different clamp values configured at build time
    is small enough to fit the full unordered array into a single cache
    line, for configurations of up to 7 buckets.

    Add to struct rq the basic data structures required to refcount the
    number of RUNNABLE tasks for each clamp bucket. Add also the max
    aggregation required to update the rq's clamp value at each
    enqueue/dequeue event.

    Use a simple linear mapping of clamp values into clamp buckets.
    Pre-compute and cache bucket_id to avoid integer divisions at
    enqueue/dequeue time.
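
    A hedged sketch of the data layout this describes (the merged code
    packs these fields as bitfields and sizes UCLAMP_BUCKETS from config;
    treat the details as illustrative):

    struct uclamp_bucket {
            unsigned long value;    /* clamp value tracked by this bucket */
            unsigned long tasks;    /* RUNNABLE tasks refcounting it */
    };

    struct uclamp_rq {
            unsigned int value;     /* current MAX-aggregated clamp value */
            struct uclamp_bucket bucket[UCLAMP_BUCKETS];
    };

    /* per task, per clamp index (clamp_id): which bucket is refcounted */
    struct uclamp_se {
            unsigned int value;
            unsigned int bucket_id;
    };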

    Signed-off-by: Patrick Bellasi
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Alessio Balsini
    Cc: Dietmar Eggemann
    Cc: Joel Fernandes
    Cc: Juri Lelli
    Cc: Linus Torvalds
    Cc: Morten Rasmussen
    Cc: Paul Turner
    Cc: Peter Zijlstra
    Cc: Quentin Perret
    Cc: Rafael J . Wysocki
    Cc: Steve Muckle
    Cc: Suren Baghdasaryan
    Cc: Tejun Heo
    Cc: Thomas Gleixner
    Cc: Todd Kjos
    Cc: Vincent Guittot
    Cc: Viresh Kumar
    Link: https://lkml.kernel.org/r/20190621084217.8167-2-patrick.bellasi@arm.com
    Signed-off-by: Ingo Molnar

    Patrick Bellasi
     

15 Jun, 2019

3 commits

  • Sometimes it's useful to be able to explicitly ensure certain headers
    remain self-contained, i.e. that they are compilable as standalone
    units, by including and/or forward declaring everything they depend on.

    Add special target header-test-y where individual Makefiles can add
    headers to be tested if CONFIG_HEADER_TEST is enabled. This will
    generate a dummy C file per header that gets built as part of extra-y.
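
    A hedged illustration: for a Makefile line such as
    "header-test-y += intel_foo.h", the build would generate roughly the
    following dummy unit (file and header names are hypothetical):

    /*
     * Generated wrapper for "header-test-y += intel_foo.h".  It contains
     * nothing but the include, so any missing #include or forward
     * declaration inside intel_foo.h becomes a build error here.
     */
    #include "intel_foo.h"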

    Signed-off-by: Jani Nikula
    Reviewed-by: Sam Ravnborg
    Signed-off-by: Masahiro Yamada

    Jani Nikula
     
  • In order to prepare to add them to the Kernel API book,
    convert the files to ReST format.

    The conversion is actually:
    - add blank lines and indentation in order to identify paragraphs;
    - fix tables markups;
    - add some lists markups;
    - mark literal blocks;
    - adjust title markups.

    At its new index.rst, let's add a :orphan: while this is not linked to
    the main index.rst file, in order to avoid build warnings.

    Signed-off-by: Mauro Carvalho Chehab
    Signed-off-by: Jonathan Corbet

    Mauro Carvalho Chehab
     
  • Convert the cgroup-v1 files to ReST format, in order to
    allow a later addition to the admin-guide.

    The conversion is actually:
    - add blank lines and indentation in order to identify paragraphs;
    - fix tables markups;
    - add some lists markups;
    - mark literal blocks;
    - adjust title markups.

    At its new index.rst, let's add a :orphan: while this is not linked to
    the main index.rst file, in order to avoid build warnings.

    Signed-off-by: Mauro Carvalho Chehab
    Acked-by: Tejun Heo
    Signed-off-by: Tejun Heo

    Mauro Carvalho Chehab
     

09 Jun, 2019

1 commit

  • Pull char/misc driver fixes from Greg KH:
    "Here are some small char and misc driver fixes for 5.2-rc4 to resolve
    a number of reported issues.

    The most "notable" one here is the kernel headers in proc^Wsysfs
    fixes. Those changes move the header file info into sysfs and fixes
    the build issues that you reported.

    Other than that, a bunch of small habanalabs driver fixes, some fpga
    driver fixes, and a few other tiny driver fixes.

    All of these have been in linux-next for a while with no reported
    issues"

    * tag 'char-misc-5.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
    habanalabs: Read upper bits of trace buffer from RWPHI
    habanalabs: Fix virtual address access via debugfs for 2MB pages
    fpga: zynqmp-fpga: Correctly handle error pointer
    habanalabs: fix bug in checking huge page optimization
    habanalabs: Avoid using a non-initialized MMU cache mutex
    habanalabs: fix debugfs code
    uapi/habanalabs: add opcode for enable/disable device debug mode
    habanalabs: halt debug engines on user process close
    test_firmware: Use correct snprintf() limit
    genwqe: Prevent an integer overflow in the ioctl
    parport: Fix mem leak in parport_register_dev_model
    fpga: dfl: expand minor range when registering chrdev region
    fpga: dfl: Add lockdep classes for pdata->lock
    fpga: dfl: afu: Pass the correct device to dma_mapping_error()
    fpga: stratix10-soc: fix use-after-free on s10_init()
    w1: ds2408: Fix typo after 49695ac46861 (reset on output_write retry with readback)
    kheaders: Do not regenerate archive if config is not changed
    kheaders: Move from proc to sysfs
    lkdtm/bugs: Adjust recursion test to avoid elision
    lkdtm/usercopy: Moves the KERNEL_DS test to non-canonical

    Linus Torvalds
     

03 Jun, 2019

3 commits

  • Chain keys are computed using the Jenkins hash function, which needs an initial
    hash to start with. Dedicate a macro to make this clear and configurable. A
    later patch changes this initial chain key.

    Signed-off-by: Yuyang Du
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: bvanassche@acm.org
    Cc: frederic@kernel.org
    Cc: ming.lei@redhat.com
    Cc: will.deacon@arm.com
    Link: https://lkml.kernel.org/r/20190506081939.74287-9-duyuyang@gmail.com
    Signed-off-by: Ingo Molnar

    Yuyang Du
     
  • Despite the fact that there is a lockdep_init_task() which does
    nothing, lockdep initializes tasks by assigning lockdep fields
    directly and does so inconsistently. Fix
    this by using lockdep_init_task().

    Signed-off-by: Yuyang Du
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: bvanassche@acm.org
    Cc: frederic@kernel.org
    Cc: ming.lei@redhat.com
    Cc: will.deacon@arm.com
    Link: https://lkml.kernel.org/r/20190506081939.74287-8-duyuyang@gmail.com
    Signed-off-by: Ingo Molnar

    Yuyang Du
     
  • In commit:

    4b53a3412d66 ("sched/core: Remove the tsk_nr_cpus_allowed() wrapper")

    the tsk_nr_cpus_allowed() wrapper was removed. There was not
    much difference in !RT but in RT we used this to implement
    migrate_disable(). Within a migrate_disable() section the CPU mask is
    restricted to a single CPU while the "normal" CPU mask remains untouched.

    As an alternative implementation Ingo suggested to use:

    struct task_struct {
            const cpumask_t         *cpus_ptr;
            cpumask_t               cpus_mask;
    };

    with

    t->cpus_ptr = &t->cpus_mask;

    In -RT we then can switch the cpus_ptr to:

    t->cpus_ptr = &cpumask_of(task_cpu(p));

    in a migration disabled region. The rules are simple:

    - Code that 'uses' ->cpus_allowed would use the pointer.
    - Code that 'modifies' ->cpus_allowed would use the direct mask.
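
    To make the two rules concrete, a small hedged sketch (the helper
    functions are invented; the cpumask accessors are the standard kernel
    ones):

    /* 'uses' the mask: read through the pointer */
    static bool example_can_run_on(struct task_struct *p, int cpu)
    {
            return cpumask_test_cpu(cpu, p->cpus_ptr);
    }

    /* 'modifies' the mask: write the direct mask */
    static void example_set_allowed(struct task_struct *p,
                                    const struct cpumask *new_mask)
    {
            cpumask_copy(&p->cpus_mask, new_mask);
    }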

    Signed-off-by: Sebastian Andrzej Siewior
    Signed-off-by: Peter Zijlstra (Intel)
    Reviewed-by: Thomas Gleixner
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Link: https://lkml.kernel.org/r/20190423142636.14347-1-bigeasy@linutronix.de
    Signed-off-by: Ingo Molnar

    Sebastian Andrzej Siewior
     

31 May, 2019

1 commit

  • Based on 1 normalized pattern(s):

    this program is free software you can redistribute it and or modify
    it under the terms of the gnu general public license as published by
    the free software foundation version 2 of the license this program
    is distributed in the hope that it will be useful but without any
    warranty without even the implied warranty of merchantability or
    fitness for a particular purpose see the gnu general public license
    for more details you should have received a copy of the gnu general
    public license along with this program if not write to the free
    software foundation inc 59 temple place suite 330 boston ma 02111
    1307 usa

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-only

    has been chosen to replace the boilerplate/reference in 83 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Richard Fontana
    Reviewed-by: Kate Stewart
    Reviewed-by: Allison Randal
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190527070034.021731668@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

25 May, 2019

1 commit

  • The kheaders archive consisting of the kernel headers used for compiling
    bpf programs is in /proc. However, there is concern that keeping it
    there will make it permanent. Let us move it to /sys/kernel as
    discussed [1].

    [1] https://lore.kernel.org/patchwork/patch/1067310/#1265969

    Suggested-by: Steven Rostedt
    Signed-off-by: Joel Fernandes (Google)
    Signed-off-by: Greg Kroah-Hartman

    Joel Fernandes (Google)
     

21 May, 2019

2 commits

  • Add SPDX license identifiers to all Make/Kconfig files which:

    - Have no license information of any form

    These files fall under the project license, GPL v2 only. The resulting SPDX
    license identifier is:

    GPL-2.0-only

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     
  • Add SPDX license identifiers to all files which:

    - Have no license information of any form

    - Have EXPORT_.*_SYMBOL_GPL inside which was used in the
    initial scan/conversion to ignore the file

    These files fall under the project license, GPL v2 only. The resulting SPDX
    license identifier is:

    GPL-2.0-only

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

19 May, 2019

1 commit

  • Since commit 54c7a8916a88 ("initramfs: free initrd memory if opening
    /initrd.image fails"), the kernel has unconditionally attempted to free
    the initrd even if it doesn't exist.

    In the non-existent case this causes a boot-time splat if
    CONFIG_DEBUG_VIRTUAL is enabled due to a call to virt_to_phys() with a
    NULL address.

    Instead we should check that the initrd actually exists and only attempt
    to free it if it does.
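
    The essence of the fix, sketched with the surrounding function
    abbreviated (initrd_start is the kernel's existing global; zero means
    no initrd was loaded):

    static int __init populate_rootfs(void)
    {
            if (!initrd_start)      /* no initrd: nothing to free */
                    goto done;
            /* open/copy /initrd.image, then free the initrd as before */
    done:
            return 0;
    }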

    Link: http://lkml.kernel.org/r/20190516143125.48948-1-steven.price@arm.com
    Fixes: 54c7a8916a88 ("initramfs: free initrd memory if opening /initrd.image fails")
    Signed-off-by: Steven Price
    Reported-by: Mark Rutland
    Tested-by: Mark Rutland
    Reviewed-by: Mike Rapoport
    Cc: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Steven Price
     

15 May, 2019

5 commits

  • Patch series "mm: Randomize free memory", v10.

    This patch (of 3):

    Randomization of the page allocator improves the average utilization of
    a direct-mapped memory-side-cache. Memory side caching is a platform
    capability that Linux has been previously exposed to in HPC
    (high-performance computing) environments on specialty platforms. In
    that instance it was a smaller pool of high-bandwidth-memory relative to
    higher-capacity / lower-bandwidth DRAM. Now, this capability is going
    to be found on general purpose server platforms where DRAM is a cache in
    front of higher latency persistent memory [1].

    Robert offered an explanation of the state of the art of Linux
    interactions with memory-side-caches [2], and I copy it here:

    It's been a problem in the HPC space:
    http://www.nersc.gov/research-and-development/knl-cache-mode-performance-coe/

    A kernel module called zonesort is available to try to help:
    https://software.intel.com/en-us/articles/xeon-phi-software

    and this abandoned patch series proposed that for the kernel:
    https://lkml.kernel.org/r/20170823100205.17311-1-lukasz.daniluk@intel.com

    Dan's patch series doesn't attempt to ensure buffers won't conflict, but
    also reduces the chance that the buffers will. This will make performance
    more consistent, albeit slower than "optimal" (which is near impossible
    to attain in a general-purpose kernel). That's better than forcing
    users to deploy remedies like:
    "To eliminate this gradual degradation, we have added a Stream
    measurement to the Node Health Check that follows each job;
    nodes are rebooted whenever their measured memory bandwidth
    falls below 300 GB/s."

    A replacement for zonesort was merged upstream in commit cc9aec03e58f
    ("x86/numa_emulation: Introduce uniform split capability"). With this
    numa_emulation capability, memory can be split into cache sized
    ("near-memory" sized) numa nodes. A bind operation to such a node, and
    disabling workloads on other nodes, enables full cache performance.
    However, once the workload exceeds the cache size then cache conflicts
    are unavoidable. While HPC environments might be able to tolerate
    time-scheduling of cache sized workloads, for general purpose server
    platforms, the oversubscribed cache case will be the common case.

    The worst case scenario is that a server system owner benchmarks a
    workload at boot with an un-contended cache only to see that performance
    degrade over time, even below the average cache performance due to
    excessive conflicts. Randomization clips the peaks and fills in the
    valleys of cache utilization to yield steady average performance.

    Here are some performance impact details of the patches:

    1/ An Intel internal synthetic memory bandwidth measurement tool, saw a
    3X speedup in a contrived case that tries to force cache conflicts.
    The contrived case used the numa_emulation capability to force an
    instance of the benchmark to be run in two of the near-memory sized
    numa nodes. If both instances were placed on the same emulated node they
    would fit and cause zero conflicts. While on separate emulated nodes
    without randomization they underutilized the cache and conflicted
    unnecessarily due to the in-order allocation per node.

    2/ A well known Java server application benchmark was run with a heap
    size that exceeded cache size by 3X. The cache conflict rate was 8%
    for the first run and degraded to 21% after page allocator aging. With
    randomization enabled the rate levelled out at 11%.

    3/ A MongoDB workload did not observe measurable difference in
    cache-conflict rates, but the overall throughput dropped by 7% with
    randomization in one case.

    4/ Mel Gorman ran his suite of performance workloads with randomization
    enabled on platforms without a memory-side-cache and saw a mix of some
    improvements and some losses [3].

    While there is potentially significant improvement for applications that
    depend on low latency access across a wide working-set, the performance
    may be negligible to negative for other workloads. For this reason the
    shuffle capability defaults to off unless a direct-mapped
    memory-side-cache is detected. Even then, the page_alloc.shuffle=0
    parameter can be specified to disable the randomization on those systems.

    Outside of memory-side-cache utilization concerns there is potentially
    security benefit from randomization. Some data exfiltration and
    return-oriented-programming attacks rely on the ability to infer the
    location of sensitive data objects. The kernel page allocator, especially
    early in system boot, has predictable first-in-first out behavior for
    physical pages. Pages are freed in physical address order when first
    onlined.

    Quoting Kees:
    "While we already have a base-address randomization
    (CONFIG_RANDOMIZE_MEMORY), attacks against the same hardware and
    memory layouts would certainly be using the predictability of
    allocation ordering (i.e. for attacks where the base address isn't
    important: only the relative positions between allocated memory).
    This is common in lots of heap-style attacks. They try to gain
    control over ordering by spraying allocations, etc.

    I'd really like to see this because it gives us something similar
    to CONFIG_SLAB_FREELIST_RANDOM but for the page allocator."

    While SLAB_FREELIST_RANDOM reduces the predictability of some local slab
    caches, it leaves the vast bulk of memory to be predictably allocated in
    order.
    However, it should be noted, the concrete security benefits are hard to
    quantify, and no known CVE is mitigated by this randomization.

    Introduce shuffle_free_memory(), and its helper shuffle_zone(), to perform
    a Fisher-Yates shuffle of the page allocator 'free_area' lists when they
    are initially populated with free memory at boot and at hotplug time. Do
    this based on either the presence of a page_alloc.shuffle=Y command line
    parameter, or autodetection of a memory-side-cache (to be added in a
    follow-on patch).

    The shuffling is done in terms of CONFIG_SHUFFLE_PAGE_ORDER sized free
    pages, where the default CONFIG_SHUFFLE_PAGE_ORDER is MAX_ORDER-1,
    i.e. 10 (4MB); this trades off randomization granularity for time spent
    shuffling. MAX_ORDER-1 was chosen to be minimally invasive to the page
    allocator while still showing memory-side cache behavior improvements,
    and in the expectation that the security implications of finer
    granularity randomization are mitigated by CONFIG_SLAB_FREELIST_RANDOM. The
    performance impact of the shuffling appears to be in the noise compared to
    other memory initialization work.

    This initial randomization can be undone over time so a follow-on patch is
    introduced to inject entropy on page free decisions. It is reasonable to
    ask if the page free entropy is sufficient, but it is not enough due to
    the in-order initial freeing of pages. At the start of that process
    putting page1 in front of or behind page0 still keeps them close together,
    page2 is still near page1 and has a high chance of being adjacent. As
    more pages are added ordering diversity improves, but there is still high
    page locality for the low address pages and this leads to no significant
    impact to the cache conflict rate.
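
    For reference, a hedged sketch of the underlying Fisher-Yates step,
    operating on a plain array for clarity; the series applies the same
    idea to the free_area lists, and get_random_u32() is the kernel's
    existing helper:

    #include <linux/random.h>

    /* Shuffle n entries in place; each permutation is (nearly) equally
     * likely, modulo the slight bias of the modulo reduction.
     */
    static void shuffle_array(unsigned long *entries, unsigned int n)
    {
            unsigned int i, j;
            unsigned long tmp;

            for (i = n - 1; i > 0; i--) {
                    j = get_random_u32() % (i + 1); /* pick from [0, i] */
                    tmp = entries[i];
                    entries[i] = entries[j];
                    entries[j] = tmp;
            }
    }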

    [1]: https://itpeernetwork.intel.com/intel-optane-dc-persistent-memory-operating-modes/
    [2]: https://lkml.kernel.org/r/AT5PR8401MB1169D656C8B5E121752FC0F8AB120@AT5PR8401MB1169.NAMPRD84.PROD.OUTLOOK.COM
    [3]: https://lkml.org/lkml/2018/10/12/309

    [dan.j.williams@intel.com: fix shuffle enable]
    Link: http://lkml.kernel.org/r/154943713038.3858443.4125180191382062871.stgit@dwillia2-desk3.amr.corp.intel.com
    [cai@lca.pw: fix SHUFFLE_PAGE_ALLOCATOR help texts]
    Link: http://lkml.kernel.org/r/20190425201300.75650-1-cai@lca.pw
    Link: http://lkml.kernel.org/r/154899811738.3165233.12325692939590944259.stgit@dwillia2-desk3.amr.corp.intel.com
    Signed-off-by: Dan Williams
    Signed-off-by: Qian Cai
    Reviewed-by: Kees Cook
    Acked-by: Michal Hocko
    Cc: Dave Hansen
    Cc: Keith Busch
    Cc: Robert Elliott
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dan Williams
     
  • Various architectures including x86 poison the freed init memory. Do the
    same in the generic free_initmem implementation and switch sparc32
    architecture that is identical to the generic code over to it now.

    Link: http://lkml.kernel.org/r/1550515285-17446-4-git-send-email-rppt@linux.ibm.com
    Signed-off-by: Mike Rapoport
    Reviewed-by: Andrew Morton
    Cc: Christoph Hellwig
    Cc: Palmer Dabbelt
    Cc: Richard Kuo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Rapoport
     
  • Patch series "provide a generic free_initmem implementation", v2.

    Many architectures implement free_initmem() in exactly the same or very
    similar way: they wrap the call to free_initmem_default() with a
    sometimes different 'poison' parameter.

    These patches switch those architectures to use a generic implementation
    that does free_initmem_default(POISON_FREE_INITMEM).

    This was inspired by Christoph's patches for free_initrd_mem [1] and I
    shamelessly copied changelog entries from his patches :)

    [1] https://lore.kernel.org/lkml/20190213174621.29297-1-hch@lst.de/

    This patch (of 2):

    For most architectures free_initmem is just a wrapper for the same
    free_initmem_default(-1) call. Provide that as a generic implementation
    marked __weak.
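
    Which amounts to roughly the following; this is a sketch matching the
    patch description (the follow-up patch switches the parameter to
    POISON_FREE_INITMEM):

    /* Generic fallback; architectures with special needs override it. */
    void __weak free_initmem(void)
    {
            free_initmem_default(-1);
    }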

    Link: http://lkml.kernel.org/r/1550515285-17446-2-git-send-email-rppt@linux.ibm.com
    Signed-off-by: Mike Rapoport
    Reviewed-by: Andrew Morton
    Cc: Christoph Hellwig
    Cc: Palmer Dabbelt
    Cc: Richard Kuo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Rapoport
     
  • Various architectures including x86 poison the freed initrd memory. Do
    the same in the generic free_initrd_mem implementation and switch a few
    more architectures that are identical to the generic code over to it now.

    Link: http://lkml.kernel.org/r/20190213174621.29297-9-hch@lst.de
    Signed-off-by: Christoph Hellwig
    Acked-by: Mike Rapoport
    Cc: Catalin Marinas [arm64]
    Cc: Geert Uytterhoeven [m68k]
    Cc: Steven Price
    Cc: Alexander Viro
    Cc: Guan Xuetao
    Cc: Russell King
    Cc: Will Deacon
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Hellwig
     
  • For most architectures free_initrd_mem just expands to the same
    free_reserved_area call. Provide that as a generic implementation marked
    __weak.

    Link: http://lkml.kernel.org/r/20190213174621.29297-8-hch@lst.de
    Signed-off-by: Christoph Hellwig
    Acked-by: Geert Uytterhoeven [m68k]
    Acked-by: Mike Rapoport
    Cc: Catalin Marinas [arm64]
    Cc: Steven Price
    Cc: Alexander Viro
    Cc: Guan Xuetao
    Cc: Russell King
    Cc: Will Deacon
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Hellwig