17 May, 2019

1 commit

  • Pull nommu generic uaccess updates from Arnd Bergmann:
    "asm-generic: kill and improve nommu generic uaccess helpers

    Christoph Hellwig writes:

    This is a series doing two somewhat interwinded things. It improves
    the asm-generic nommu uaccess helper to optionally be entirely
    generic and not require any arch helpers for the actual uaccess.
    For the generic uaccess.h to actually be generically useful I also
    had to kill off the mess we made of , which really
    shouldn't exist on most architectures"

    * tag 'asm-generic-nommu' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
    asm-generic: optimize generic uaccess for 8-byte loads and stores
    asm-generic: provide entirely generic nommu uaccess
    arch: mostly remove
    asm-generic: don't include from

    Linus Torvalds
     

15 May, 2019

5 commits

  • Merge more updates from Andrew Morton:

    - a couple of hotfixes

    - almost all of the rest of MM

    - lib/ updates

    - binfmt_elf updates

    - autofs updates

    - quite a lot of misc fixes and updates
    - reiserfs, fatfs
    - signals
    - exec
    - cpumask
    - rapidio
    - sysctl
    - pids
    - eventfd
    - gcov
    - panic
    - pps

    - gdb script updates

    - ipc updates

    * emailed patches from Andrew Morton : (126 commits)
    mm: memcontrol: fix NUMA round-robin reclaim at intermediate level
    mm: memcontrol: fix recursive statistics correctness & scalabilty
    mm: memcontrol: move stat/event counting functions out-of-line
    mm: memcontrol: make cgroup stats and events query API explicitly local
    drivers/virt/fsl_hypervisor.c: prevent integer overflow in ioctl
    drivers/virt/fsl_hypervisor.c: dereferencing error pointers in ioctl
    mm, memcg: rename ambiguously named memory.stat counters and functions
    arch: remove and
    treewide: replace #include with #include
    fs/block_dev.c: Remove duplicate header
    fs/cachefiles/namei.c: remove duplicate header
    include/linux/sched/signal.h: replace `tsk' with `task'
    fs/coda/psdev.c: remove duplicate header
    ipc: do cyclic id allocation for the ipc object.
    ipc: conserve sequence numbers in ipcmni_extend mode
    ipc: allow boot time extension of IPCMNI from 32k to 16M
    ipc/mqueue: optimize msg_get()
    ipc/mqueue: remove redundant wq task assignment
    ipc: prevent lockup on alloc_msg and free_msg
    scripts/gdb: print cached rate in lx-clk-summary
    ...

    Linus Torvalds
     
  • Now that all instances of #include have been replaced with
    #include , we can remove these.

    Link: http://lkml.kernel.org/r/1553267665-27228-2-git-send-email-yamada.masahiro@socionext.com
    Signed-off-by: Masahiro Yamada
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Masahiro Yamada
     
  • The "WITH Linux-syscall-note" should be added to headers exported to the
    user-space.

    Some kernel-space headers have "WITH Linux-syscall-note", which seems a
    mistake.

    [1] arch/x86/include/asm/hyperv-tlfs.h

    Commit 5a4858032217 ("x86/hyper-v: move hyperv.h out of uapi") moved
    this file out of uapi, but missed to update the SPDX License tag.

    [2] include/asm-generic/shmparam.h

    Commit 76ce2a80a28e ("Rename include/{uapi => }/asm-generic/shmparam.h
    really") moved this file out of uapi, but missed to update the SPDX
    License tag.

    [3] include/linux/qcom-geni-se.h

    Commit eddac5af0654 ("soc: qcom: Add GENI based QUP Wrapper driver")
    added this file, but I do not see a good reason why its license tag must
    include "WITH Linux-syscall-note".

    Link: http://lkml.kernel.org/r/1554196104-3522-1-git-send-email-yamada.masahiro@socionext.com
    Signed-off-by: Masahiro Yamada
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Masahiro Yamada
     
  • Pull modules updates from Jessica Yu:

    - Use a separate table to store symbol types instead of hijacking
    fields in struct Elf_Sym

    - Trivial code cleanups

    * tag 'modules-for-v5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux:
    module: add stubs for within_module functions
    kallsyms: store type information in its own array
    vmlinux.lds.h: drop unused __vermagic

    Linus Torvalds
     
  • On systems without CONTIG_ALLOC activated but that support gigantic pages,
    boottime reserved gigantic pages can not be freed at all. This patch
    simply enables the possibility to hand back those pages to memory
    allocator.

    Link: http://lkml.kernel.org/r/20190327063626.18421-5-alex@ghiti.fr
    Signed-off-by: Alexandre Ghiti
    Acked-by: David S. Miller [sparc]
    Reviewed-by: Mike Kravetz
    Cc: Andy Lutomirsky
    Cc: Aneesh Kumar K.V
    Cc: Benjamin Herrenschmidt
    Cc: Borislav Petkov
    Cc: Catalin Marinas
    Cc: Dave Hansen
    Cc: Heiko Carstens
    Cc: "H . Peter Anvin"
    Cc: Ingo Molnar
    Cc: Martin Schwidefsky
    Cc: Michael Ellerman
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Rich Felker
    Cc: Thomas Gleixner
    Cc: Vlastimil Babka
    Cc: Will Deacon
    Cc: Yoshinori Sato
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexandre Ghiti
     

09 May, 2019

1 commit

  • Pull Kbuild updates from Masahiro Yamada:

    - allow users to invoke 'make' out of the source tree

    - refactor scripts/mkmakefile

    - deprecate KBUILD_SRC, which was used to track the source tree
    location for O= build.

    - fix recordmcount.pl in case objdump output is localized

    - turn unresolved symbols in external modules to errors from warnings
    by default; pass KBUILD_MODPOST_WARN=1 to get them back to warnings

    - generate modules.builtin.modinfo to collect .modinfo data from
    built-in modules

    - misc Makefile cleanups

    * tag 'kbuild-v5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (21 commits)
    .gitignore: add more all*.config patterns
    moduleparam: Save information about built-in modules in separate file
    Remove MODULE_ALIAS() calls that take undefined macro
    .gitignore: add leading and trailing slashes to generated directories
    scripts/tags.sh: fix direct execution of scripts/tags.sh
    scripts: override locale from environment when running recordmcount.pl
    samples: kobject: allow CONFIG_SAMPLE_KOBJECT to become y
    samples: seccomp: turn CONFIG_SAMPLE_SECCOMP into a bool option
    kbuild: move Documentation to vmlinux-alldirs
    kbuild: move samples/ to KBUILD_VMLINUX_OBJS
    modpost: make KBUILD_MODPOST_WARN also configurable for external modules
    kbuild: check arch/$(SRCARCH)/include/generated before out-of-tree build
    kbuild: remove unneeded dependency for include/config/kernel.release
    memory: squash drivers/memory/Makefile.asm-offsets
    kbuild: use $(srctree) instead of KBUILD_SRC to check out-of-tree build
    kbuild: mkmakefile: generate a simple wrapper of top Makefile
    kbuild: mkmakefile: do not check the generated Makefile marker
    kbuild: allow Kbuild to start from any directory
    kbuild: pass $(MAKECMDGOALS) to sub-make as is
    kbuild: fix warning "overriding recipe for target 'Makefile'"
    ...

    Linus Torvalds
     

08 May, 2019

1 commit

  • Pull audit updates from Paul Moore:
    "We've got a reasonably broad set of audit patches for the v5.2 merge
    window, the highlights are below:

    - The biggest change, and the source of all the arch/* changes, is
    the patchset from Dmitry to help enable some of the work he is
    doing around PTRACE_GET_SYSCALL_INFO.

    To be honest, including this in the audit tree is a bit of a
    stretch, but it does help move audit a little further along towards
    proper syscall auditing for all arches, and everyone else seemed to
    agree that audit was a "good" spot for this to land (or maybe they
    just didn't want to merge it? dunno.).

    - We can now audit time/NTP adjustments.

    - We continue the work to connect associated audit records into a
    single event"

    * tag 'audit-pr-20190507' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit: (21 commits)
    audit: fix a memory leak bug
    ntp: Audit NTP parameters adjustment
    timekeeping: Audit clock adjustments
    audit: purge unnecessary list_empty calls
    audit: link integrity evm_write_xattrs record to syscall event
    syscall_get_arch: add "struct task_struct *" argument
    unicore32: define syscall_get_arch()
    Move EM_UNICORE to uapi/linux/elf-em.h
    nios2: define syscall_get_arch()
    nds32: define syscall_get_arch()
    Move EM_NDS32 to uapi/linux/elf-em.h
    m68k: define syscall_get_arch()
    hexagon: define syscall_get_arch()
    Move EM_HEXAGON to uapi/linux/elf-em.h
    h8300: define syscall_get_arch()
    c6x: define syscall_get_arch()
    arc: define syscall_get_arch()
    Move EM_ARCOMPACT and EM_ARCV2 to uapi/linux/elf-em.h
    audit: Make audit_log_cap and audit_copy_inode static
    audit: connect LOGIN record to its syscall record
    ...

    Linus Torvalds
     

07 May, 2019

7 commits

  • Problem:

    When a kernel module is compiled as a separate module, some important
    information about the kernel module is available via .modinfo section of
    the module. In contrast, when the kernel module is compiled into the
    kernel, that information is not available.

    Information about built-in modules is necessary in the following cases:

    1. When it is necessary to find out what additional parameters can be
    passed to the kernel at boot time.

    2. When you need to know which module names and their aliases are in
    the kernel. This is very useful for creating an initrd image.

    Proposal:

    The proposed patch does not remove .modinfo section with module
    information from the vmlinux at the build time and saves it into a
    separate file after kernel linking. So, the kernel does not increase in
    size and no additional information remains in it. Information is stored
    in the same format as in the separate modules (null-terminated string
    array). Because the .modinfo section is already exported with a separate
    modules, we are not creating a new API.

    It can be easily read in the userspace:

    $ tr '\0' '\n' < modules.builtin.modinfo
    ext4.softdep=pre: crc32c
    ext4.license=GPL
    ext4.description=Fourth Extended Filesystem
    ext4.author=Remy Card, Stephen Tweedie, Andrew Morton, Andreas Dilger, Theodore Ts'o and others
    ext4.alias=fs-ext4
    ext4.alias=ext3
    ext4.alias=fs-ext3
    ext4.alias=ext2
    ext4.alias=fs-ext2
    md_mod.alias=block-major-9-*
    md_mod.alias=md
    md_mod.description=MD RAID framework
    md_mod.license=GPL
    md_mod.parmtype=create_on_open:bool
    md_mod.parmtype=start_dirty_degraded:int
    ...

    Co-Developed-by: Gleb Fotengauer-Malinovskiy
    Signed-off-by: Gleb Fotengauer-Malinovskiy
    Signed-off-by: Alexey Gladkov
    Acked-by: Jessica Yu
    Signed-off-by: Masahiro Yamada

    Alexey Gladkov
     
  • Pull arm64 updates from Will Deacon:
    "Mostly just incremental improvements here:

    - Introduce AT_HWCAP2 for advertising CPU features to userspace

    - Expose SVE2 availability to userspace

    - Support for "data cache clean to point of deep persistence" (DC PODP)

    - Honour "mitigations=off" on the cmdline and advertise status via
    sysfs

    - CPU timer erratum workaround (Neoverse-N1 #1188873)

    - Introduce perf PMU driver for the SMMUv3 performance counters

    - Add config option to disable the kuser helpers page for AArch32 tasks

    - Futex modifications to ensure liveness under contention

    - Rework debug exception handling to seperate kernel and user
    handlers

    - Non-critical fixes and cleanup"

    * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (92 commits)
    Documentation: Add ARM64 to kernel-parameters.rst
    arm64/speculation: Support 'mitigations=' cmdline option
    arm64: ssbs: Don't treat CPUs with SSBS as unaffected by SSB
    arm64: enable generic CPU vulnerabilites support
    arm64: add sysfs vulnerability show for speculative store bypass
    arm64: Fix size of __early_cpu_boot_status
    clocksource/arm_arch_timer: Use arch_timer_read_counter to access stable counters
    clocksource/arm_arch_timer: Remove use of workaround static key
    clocksource/arm_arch_timer: Drop use of static key in arch_timer_reg_read_stable
    clocksource/arm_arch_timer: Direcly assign set_next_event workaround
    arm64: Use arch_timer_read_counter instead of arch_counter_get_cntvct
    watchdog/sbsa: Use arch_timer_read_counter instead of arch_counter_get_cntvct
    ARM: vdso: Remove dependency with the arch_timer driver internals
    arm64: Apply ARM64_ERRATUM_1188873 to Neoverse-N1
    arm64: Add part number for Neoverse N1
    arm64: Make ARM64_ERRATUM_1188873 depend on COMPAT
    arm64: Restrict ARM64_ERRATUM_1188873 mitigation to AArch32
    arm64: mm: Remove pte_unmap_nested()
    arm64: Fix compiler warning from pte_unmap() with -Wunused-but-set-variable
    arm64: compat: Reduce address limit for 64K pages
    ...

    Linus Torvalds
     
  • Pull mmiowb removal from Will Deacon:
    "Remove Mysterious Macro Intended to Obscure Weird Behaviours (mmiowb())

    Remove mmiowb() from the kernel memory barrier API and instead, for
    architectures that need it, hide the barrier inside spin_unlock() when
    MMIO has been performed inside the critical section.

    The only relatively recent changes have been addressing review
    comments on the documentation, which is in a much better shape thanks
    to the efforts of Ben and Ingo.

    I was initially planning to split this into two pull requests so that
    you could run the coccinelle script yourself, however it's been plain
    sailing in linux-next so I've just included the whole lot here to keep
    things simple"

    * tag 'arm64-mmiowb' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (23 commits)
    docs/memory-barriers.txt: Update I/O section to be clearer about CPU vs thread
    docs/memory-barriers.txt: Fix style, spacing and grammar in I/O section
    arch: Remove dummy mmiowb() definitions from arch code
    net/ethernet/silan/sc92031: Remove stale comment about mmiowb()
    i40iw: Redefine i40iw_mmiowb() to do nothing
    scsi/qla1280: Remove stale comment about mmiowb()
    drivers: Remove explicit invocations of mmiowb()
    drivers: Remove useless trailing comments from mmiowb() invocations
    Documentation: Kill all references to mmiowb()
    riscv/mmiowb: Hook up mmwiob() implementation to asm-generic code
    powerpc/mmiowb: Hook up mmwiob() implementation to asm-generic code
    ia64/mmiowb: Add unconditional mmiowb() to arch_spin_unlock()
    mips/mmiowb: Add unconditional mmiowb() to arch_spin_unlock()
    sh/mmiowb: Add unconditional mmiowb() to arch_spin_unlock()
    m68k/io: Remove useless definition of mmiowb()
    nds32/io: Remove useless definition of mmiowb()
    x86/io: Remove useless definition of mmiowb()
    arm64/io: Remove useless definition of mmiowb()
    ARM/io: Remove useless definition of mmiowb()
    mmiowb: Hook up mmiowb helpers to spinlocks and generic I/O accessors
    ...

    Linus Torvalds
     
  • Pull s390 updates from Martin Schwidefsky:

    - Support for kernel address space layout randomization

    - Add support for kernel image signature verification

    - Convert s390 to the generic get_user_pages_fast code

    - Convert s390 to the stack unwind API analog to x86

    - Add support for CPU directed interrupts for PCI devices

    - Provide support for MIO instructions to the PCI base layer, this will
    allow the use of direct PCI mappings in user space code

    - Add the basic KVM guest ultravisor interface for protected VMs

    - Add AT_HWCAP bits for several new hardware capabilities

    - Update the CPU measurement facility counter definitions to SVN 6

    - Arnds cleanup patches for his quest to get LLVM compiles working

    - A vfio-ccw update with bug fixes and support for halt and clear

    - Improvements for the hardware TRNG code

    - Another round of cleanup for the QDIO layer

    - Numerous cleanups and bug fixes

    * tag 's390-5.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (98 commits)
    s390/vdso: drop unnecessary cc-ldoption
    s390: fix clang -Wpointer-sign warnigns in boot code
    s390: drop CONFIG_VIRT_TO_BUS
    s390: boot, purgatory: pass $(CLANG_FLAGS) where needed
    s390: only build for new CPUs with clang
    s390: simplify disabled_wait
    s390/ftrace: use HAVE_FUNCTION_GRAPH_RET_ADDR_PTR
    s390/unwind: introduce stack unwind API
    s390/opcodes: add missing instructions to the disassembler
    s390/bug: add entry size to the __bug_table section
    s390: use proper expoline sections for .dma code
    s390/nospec: rename assembler generated expoline thunks
    s390: add missing ENDPROC statements to assembler functions
    locking/lockdep: check for freed initmem in static_obj()
    s390/kernel: add support for kernel address space layout randomization (KASLR)
    s390/kernel: introduce .dma sections
    s390/sclp: do not use static sccbs
    s390/kprobes: use static buffer for insn_page
    s390/kernel: convert SYSCALL and PGM_CHECK handlers to .quad
    s390/kernel: build a relocatable kernel
    ...

    Linus Torvalds
     
  • Pull x86 mm updates from Ingo Molnar:
    "The changes in here are:

    - text_poke() fixes and an extensive set of executability lockdowns,
    to (hopefully) eliminate the last residual circumstances under
    which we are using W|X mappings even temporarily on x86 kernels.
    This required a broad range of surgery in text patching facilities,
    module loading, trampoline handling and other bits.

    - tweak page fault messages to be more informative and more
    structured.

    - remove DISCONTIGMEM support on x86-32 and make SPARSEMEM the
    default.

    - reduce KASLR granularity on 5-level paging kernels from 512 GB to
    1 GB.

    - misc other changes and updates"

    * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits)
    x86/mm: Initialize PGD cache during mm initialization
    x86/alternatives: Add comment about module removal races
    x86/kprobes: Use vmalloc special flag
    x86/ftrace: Use vmalloc special flag
    bpf: Use vmalloc special flag
    modules: Use vmalloc special flag
    mm/vmalloc: Add flag for freeing of special permsissions
    mm/hibernation: Make hibernation handle unmapped pages
    x86/mm/cpa: Add set_direct_map_*() functions
    x86/alternatives: Remove the return value of text_poke_*()
    x86/jump-label: Remove support for custom text poker
    x86/modules: Avoid breaking W^X while loading modules
    x86/kprobes: Set instruction page as executable
    x86/ftrace: Set trampoline pages as executable
    x86/kgdb: Avoid redundant comparison of patched code
    x86/alternatives: Use temporary mm for text poking
    x86/alternatives: Initialize temporary mm for patching
    fork: Provide a function for copying init_mm
    uprobes: Initialize uprobes earlier
    x86/mm: Save debug registers when loading a temporary mm
    ...

    Linus Torvalds
     
  • Pull locking updates from Ingo Molnar:
    "Here are the locking changes in this cycle:

    - rwsem unification and simpler micro-optimizations to prepare for
    more intrusive (and more lucrative) scalability improvements in
    v5.3 (Waiman Long)

    - Lockdep irq state tracking flag usage cleanups (Frederic
    Weisbecker)

    - static key improvements (Jakub Kicinski, Peter Zijlstra)

    - misc updates, cleanups and smaller fixes"

    * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (26 commits)
    locking/lockdep: Remove unnecessary unlikely()
    locking/static_key: Don't take sleeping locks in __static_key_slow_dec_deferred()
    locking/static_key: Factor out the fast path of static_key_slow_dec()
    locking/static_key: Add support for deferred static branches
    locking/lockdep: Test all incompatible scenarios at once in check_irq_usage()
    locking/lockdep: Avoid bogus Clang warning
    locking/lockdep: Generate LOCKF_ bit composites
    locking/lockdep: Use expanded masks on find_usage_*() functions
    locking/lockdep: Map remaining magic numbers to lock usage mask names
    locking/lockdep: Move valid_state() inside CONFIG_TRACE_IRQFLAGS && CONFIG_PROVE_LOCKING
    locking/rwsem: Prevent unneeded warning during locking selftest
    locking/rwsem: Optimize rwsem structure for uncontended lock acquisition
    locking/rwsem: Enable lock event counting
    locking/lock_events: Don't show pvqspinlock events on bare metal
    locking/lock_events: Make lock_events available for all archs & other locks
    locking/qspinlock_stat: Introduce generic lockevent_*() counting APIs
    locking/rwsem: Enhance DEBUG_RWSEMS_WARN_ON() macro
    locking/rwsem: Add debug check for __down_read*()
    locking/rwsem: Micro-optimize rwsem_try_read_lock_unqueued()
    locking/rwsem: Move rwsem internal function declarations to rwsem-xadd.h
    ...

    Linus Torvalds
     
  • Pull unified TLB flushing from Ingo Molnar:
    "This contains the generic mmu_gather feature from Peter Zijlstra,
    which is an all-arch unification of TLB flushing APIs, via the
    following (broad) steps:

    - enhance the APIs to cover more arch details

    - convert most TLB flushing arch implementations to the generic
    APIs.

    - remove leftovers of per arch implementations

    After this series every single architecture makes use of the unified
    TLB flushing APIs"

    * 'core-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    mm/resource: Use resource_overlaps() to simplify region_intersects()
    ia64/tlb: Eradicate tlb_migrate_finish() callback
    asm-generic/tlb: Remove tlb_table_flush()
    asm-generic/tlb: Remove tlb_flush_mmu_free()
    asm-generic/tlb: Remove CONFIG_HAVE_GENERIC_MMU_GATHER
    asm-generic/tlb: Remove arch_tlb*_mmu()
    s390/tlb: Convert to generic mmu_gather
    asm-generic/tlb: Introduce CONFIG_HAVE_MMU_GATHER_NO_GATHER=y
    arch/tlb: Clean up simple architectures
    um/tlb: Convert to generic mmu_gather
    sh/tlb: Convert SH to generic mmu_gather
    ia64/tlb: Convert to generic mmu_gather
    arm/tlb: Convert to generic mmu_gather
    asm-generic/tlb, arch: Invert CONFIG_HAVE_RCU_TABLE_INVALIDATE
    asm-generic/tlb, ia64: Conditionally provide tlb_migrate_finish()
    asm-generic/tlb: Provide generic tlb_flush() based on flush_tlb_mm()
    asm-generic/tlb, arch: Provide generic tlb_flush() based on flush_tlb_range()
    asm-generic/tlb, arch: Provide generic VIPT cache flush
    asm-generic/tlb, arch: Provide CONFIG_HAVE_MMU_GATHER_PAGE_SIZE
    asm-generic/tlb: Provide a comment

    Linus Torvalds
     

06 May, 2019

1 commit

  • Poking-mm initialization might require to duplicate the PGD in early
    stage. Initialize the PGD cache earlier to prevent boot failures.

    Reported-by: kernel test robot
    Signed-off-by: Nadav Amit
    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Dave Hansen
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Rick Edgecombe
    Cc: Rik van Riel
    Cc: Stephen Rothwell
    Cc: Thomas Gleixner
    Fixes: 4fc19708b165 ("x86/alternatives: Initialize temporary mm for patching")
    Link: http://lkml.kernel.org/r/20190505011124.39692-1-namit@vmware.com
    Signed-off-by: Ingo Molnar

    Nadav Amit
     

30 Apr, 2019

1 commit

  • x86 has an nmi_uaccess_okay(), but other architectures do not.
    Arch-independent code might need to know whether access to user
    addresses is ok in an NMI context or in other code whose execution
    context is unknown. Specifically, this function is needed for
    bpf_probe_write_user().

    Add a default implementation of nmi_uaccess_okay() for architectures
    that do not have such a function.

    Signed-off-by: Nadav Amit
    Signed-off-by: Rick Edgecombe
    Signed-off-by: Peter Zijlstra (Intel)
    Cc:
    Cc:
    Cc:
    Cc:
    Cc:
    Cc:
    Cc:
    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Dave Hansen
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Rik van Riel
    Cc: Thomas Gleixner
    Link: https://lkml.kernel.org/r/20190426001143.4983-23-namit@vmware.com
    Signed-off-by: Ingo Molnar

    Nadav Amit
     

29 Apr, 2019

1 commit

  • The following warning occurred on s390:
    WARNING: CPU: 0 PID: 804 at kernel/locking/lockdep.c:1025 lockdep_register_key+0x30/0x150

    This is because the check in static_obj() assumes that all memory within
    [_stext, _end] belongs to static objects, which at least for s390 isn't
    true. The init section is also part of this range, and freeing it allows
    the buddy allocator to allocate memory from it. We have virt == phys for
    the kernel on s390, so that such allocations would then have addresses
    within the range [_stext, _end].

    To fix this, introduce arch_is_kernel_initmem_freed(), similar to
    arch_is_kernel_text/data(), and add it to the checks in static_obj().
    This will always return 0 on architectures that do not define
    arch_is_kernel_initmem_freed. On s390, it will return 1 if initmem has
    been freed and the address is in the range [__init_begin, __init_end].

    Signed-off-by: Gerald Schaefer
    Reviewed-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

    Gerald Schaefer
     

26 Apr, 2019

1 commit


24 Apr, 2019

4 commits


10 Apr, 2019

1 commit


08 Apr, 2019

3 commits


05 Apr, 2019

2 commits

  • After removing the start and count arguments of syscall_get_arguments() it
    seems reasonable to remove them from syscall_set_arguments(). Note, as of
    today, there are no users of syscall_set_arguments(). But we are told that
    there will be soon. But for now, at least make it consistent with
    syscall_get_arguments().

    Link: http://lkml.kernel.org/r/20190327222014.GA32540@altlinux.org

    Cc: Oleg Nesterov
    Cc: Kees Cook
    Cc: Andy Lutomirski
    Cc: Dominik Brodowski
    Cc: Dave Martin
    Cc: "Dmitry V. Levin"
    Cc: x86@kernel.org
    Cc: linux-snps-arc@lists.infradead.org
    Cc: linux-kernel@vger.kernel.org
    Cc: linux-arm-kernel@lists.infradead.org
    Cc: linux-c6x-dev@linux-c6x.org
    Cc: uclinux-h8-devel@lists.sourceforge.jp
    Cc: linux-hexagon@vger.kernel.org
    Cc: linux-ia64@vger.kernel.org
    Cc: linux-mips@vger.kernel.org
    Cc: nios2-dev@lists.rocketboards.org
    Cc: openrisc@lists.librecores.org
    Cc: linux-parisc@vger.kernel.org
    Cc: linuxppc-dev@lists.ozlabs.org
    Cc: linux-riscv@lists.infradead.org
    Cc: linux-s390@vger.kernel.org
    Cc: linux-sh@vger.kernel.org
    Cc: sparclinux@vger.kernel.org
    Cc: linux-um@lists.infradead.org
    Cc: linux-xtensa@linux-xtensa.org
    Cc: linux-arch@vger.kernel.org
    Acked-by: Max Filippov # For xtensa changes
    Acked-by: Will Deacon # For the arm64 bits
    Reviewed-by: Thomas Gleixner # for x86
    Reviewed-by: Dmitry V. Levin
    Signed-off-by: Steven Rostedt (VMware)

    Steven Rostedt (VMware)
     
  • At Linux Plumbers, Andy Lutomirski approached me and pointed out that the
    function call syscall_get_arguments() implemented in x86 was horribly
    written and not optimized for the standard case of passing in 0 and 6 for
    the starting index and the number of system calls to get. When looking at
    all the users of this function, I discovered that all instances pass in only
    0 and 6 for these arguments. Instead of having this function handle
    different cases that are never used, simply rewrite it to return the first 6
    arguments of a system call.

    This should help out the performance of tracing system calls by ptrace,
    ftrace and perf.

    Link: http://lkml.kernel.org/r/20161107213233.754809394@goodmis.org

    Cc: Oleg Nesterov
    Cc: Kees Cook
    Cc: Andy Lutomirski
    Cc: Dominik Brodowski
    Cc: Dave Martin
    Cc: "Dmitry V. Levin"
    Cc: x86@kernel.org
    Cc: linux-snps-arc@lists.infradead.org
    Cc: linux-kernel@vger.kernel.org
    Cc: linux-arm-kernel@lists.infradead.org
    Cc: linux-c6x-dev@linux-c6x.org
    Cc: uclinux-h8-devel@lists.sourceforge.jp
    Cc: linux-hexagon@vger.kernel.org
    Cc: linux-ia64@vger.kernel.org
    Cc: linux-mips@vger.kernel.org
    Cc: nios2-dev@lists.rocketboards.org
    Cc: openrisc@lists.librecores.org
    Cc: linux-parisc@vger.kernel.org
    Cc: linuxppc-dev@lists.ozlabs.org
    Cc: linux-riscv@lists.infradead.org
    Cc: linux-s390@vger.kernel.org
    Cc: linux-sh@vger.kernel.org
    Cc: sparclinux@vger.kernel.org
    Cc: linux-um@lists.infradead.org
    Cc: linux-xtensa@linux-xtensa.org
    Cc: linux-arch@vger.kernel.org
    Acked-by: Paul Burton # MIPS parts
    Acked-by: Max Filippov # For xtensa changes
    Acked-by: Will Deacon # For the arm64 bits
    Reviewed-by: Thomas Gleixner # for x86
    Reviewed-by: Dmitry V. Levin
    Reported-by: Andy Lutomirski
    Signed-off-by: Steven Rostedt (VMware)

    Steven Rostedt (Red Hat)
     

03 Apr, 2019

11 commits

  • As the generic rwsem-xadd code is using the appropriate acquire and
    release versions of the atomic operations, the arch specific rwsem.h
    files will not be that much faster than the generic code as long as the
    atomic functions are properly implemented. So we can remove those arch
    specific rwsem.h and stop building asm/rwsem.h to reduce maintenance
    effort.

    Currently, only x86, alpha and ia64 have implemented architecture
    specific fast paths. I don't have access to alpha and ia64 systems for
    testing, but they are legacy systems that are not likely to be updated
    to the latest kernel anyway.

    By using a rwsem microbenchmark, the total locking rates on a 4-socket
    56-core 112-thread x86-64 system before and after the patch were as
    follows (mixed means equal # of read and write locks):

    Before Patch After Patch
    # of Threads wlock rlock mixed wlock rlock mixed
    ------------ ----- ----- ----- ----- ----- -----
    1 29,201 30,143 29,458 28,615 30,172 29,201
    2 6,807 13,299 1,171 7,725 15,025 1,804
    4 6,504 12,755 1,520 7,127 14,286 1,345
    8 6,762 13,412 764 6,826 13,652 726
    16 6,693 15,408 662 6,599 15,938 626
    32 6,145 15,286 496 5,549 15,487 511
    64 5,812 15,495 60 5,858 15,572 60

    There were some run-to-run variations for the multi-thread tests. For
    x86-64, using the generic C code fast path seems to be a little bit
    faster than the assembly version with low lock contention. Looking at
    the assembly version of the fast paths, there are assembly to/from C
    code wrappers that save and restore all the callee-clobbered registers
    (7 registers on x86-64). The assembly generated from the generic C
    code doesn't need to do that. That may explain the slight performance
    gain here.

    The generic asm rwsem.h can also be merged into kernel/locking/rwsem.h
    with no code change as no other code other than those under
    kernel/locking needs to access the internal rwsem macros and functions.

    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Acked-by: Linus Torvalds
    Cc: Andrew Morton
    Cc: Arnd Bergmann
    Cc: Borislav Petkov
    Cc: Davidlohr Bueso
    Cc: H. Peter Anvin
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Tim Chen
    Cc: Will Deacon
    Cc: linux-arm-kernel@lists.infradead.org
    Cc: linux-c6x-dev@linux-c6x.org
    Cc: linux-m68k@lists.linux-m68k.org
    Cc: linux-riscv@lists.infradead.org
    Cc: linux-um@lists.infradead.org
    Cc: linux-xtensa@linux-xtensa.org
    Cc: linuxppc-dev@lists.ozlabs.org
    Cc: nios2-dev@lists.rocketboards.org
    Cc: openrisc@lists.librecores.org
    Cc: uclinux-h8-devel@lists.sourceforge.jp
    Link: https://lkml.kernel.org/r/20190322143008.21313-2-longman@redhat.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     
  • Only ia64-sn2 uses this as an optimization, and there it is of
    questionable correctness due to the mm_users==1 test.

    Remove it entirely.

    No change in behavior intended.

    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Dave Hansen
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Rik van Riel
    Cc: Thomas Gleixner
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • There are no external users of this API (nor should there be); remove it.

    Signed-off-by: Peter Zijlstra (Intel)
    Acked-by: Will Deacon
    Cc: Andrew Morton
    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Dave Hansen
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Rik van Riel
    Cc: Thomas Gleixner
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • As the comment notes; it is a potentially dangerous operation. Just
    use tlb_flush_mmu(), that will skip the (double) TLB invalidate if
    it really isn't needed anyway.

    No change in behavior intended.

    Signed-off-by: Peter Zijlstra (Intel)
    Acked-by: Will Deacon
    Cc: Andrew Morton
    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Dave Hansen
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Rik van Riel
    Cc: Thomas Gleixner
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Since all architectures are now using it, it is redundant.

    Signed-off-by: Peter Zijlstra (Intel)
    Acked-by: Will Deacon
    Cc: Andrew Morton
    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Dave Hansen
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Rik van Riel
    Cc: Thomas Gleixner
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Add the Kconfig option HAVE_MMU_GATHER_NO_GATHER to the generic
    mmu_gather code. If the option is set the mmu_gather will not
    track individual pages for delayed page free anymore. A platform
    that enables the option needs to provide its own implementation
    of the __tlb_remove_page_size() function to free pages.

    No change in behavior intended.

    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Peter Zijlstra (Intel)
    Acked-by: Will Deacon
    Cc: Andrew Morton
    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Dave Hansen
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Rik van Riel
    Cc: Thomas Gleixner
    Cc: aneesh.kumar@linux.vnet.ibm.com
    Cc: heiko.carstens@de.ibm.com
    Cc: linux@armlinux.org.uk
    Cc: npiggin@gmail.com
    Link: http://lkml.kernel.org/r/20180918125151.31744-2-schwidefsky@de.ibm.com
    Signed-off-by: Ingo Molnar

    Martin Schwidefsky
     
  • Make issuing a TLB invalidate for page-table pages the normal case.

    The reason is twofold:

    - too many invalidates is safer than too few,
    - most architectures use the linux page-tables natively
    and would thus require this.

    Make it an opt-out, instead of an opt-in.

    No change in behavior intended.

    Signed-off-by: Peter Zijlstra (Intel)
    Acked-by: Will Deacon
    Cc: Andrew Morton
    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Dave Hansen
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Rik van Riel
    Cc: Thomas Gleixner
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Needed for ia64 -- alternatively we drop the entire hook.

    Signed-off-by: Peter Zijlstra (Intel)
    Acked-by: Will Deacon
    Cc: Andrew Morton
    Cc: Andy Lutomirski
    Cc: Aneesh Kumar K.V
    Cc: Borislav Petkov
    Cc: Dave Hansen
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Nick Piggin
    Cc: Peter Zijlstra
    Cc: Rik van Riel
    Cc: Thomas Gleixner
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • When an architecture does not have (an efficient) flush_tlb_range(),
    but instead always uses full TLB invalidates, the current generic
    tlb_flush() is sub-optimal, for it will generate extra flushes in
    order to keep the range small.

    But if we cannot do range flushes, that is a moot concern. Optionally
    provide this simplified default.

    No change in behavior intended.

    Signed-off-by: Peter Zijlstra (Intel)
    Acked-by: Will Deacon
    Cc: Andrew Morton
    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Dave Hansen
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Rik van Riel
    Cc: Thomas Gleixner
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Provide a generic tlb_flush() implementation that relies on
    flush_tlb_range(). This is a little awkward because flush_tlb_range()
    assumes a VMA for range invalidation, but we no longer have one.

    Audit of all flush_tlb_range() implementations shows only vma->vm_mm
    and vma->vm_flags are used, and of the latter only VM_EXEC (I-TLB
    invalidates) and VM_HUGETLB (large TLB invalidate) are used.

    Therefore, track VM_EXEC and VM_HUGETLB in two more bits, and create a
    'fake' VMA.

    This allows architectures that have a reasonably efficient
    flush_tlb_range() to not require any additional effort.

    No change in behavior intended.

    Signed-off-by: Peter Zijlstra (Intel)
    Acked-by: Will Deacon
    Cc: Andrew Morton
    Cc: Andy Lutomirski
    Cc: Aneesh Kumar K.V
    Cc: Borislav Petkov
    Cc: Dave Hansen
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Nick Piggin
    Cc: Peter Zijlstra
    Cc: Rik van Riel
    Cc: Thomas Gleixner
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • The one obvious thing SH and ARM want is a sensible default for
    tlb_start_vma(). (also: https://lkml.org/lkml/2004/1/15/6 )

    Avoid all VIPT architectures providing their own tlb_start_vma()
    implementation and rely on architectures to provide a no-op
    flush_cache_range() when it is not relevant.

    This patch makes tlb_start_vma() default to flush_cache_range(), which
    should be right and sufficient. The only exceptions that I found where
    (oddly):

    - m68k-mmu
    - sparc64
    - unicore

    Those architectures appear to have flush_cache_range(), but their
    current tlb_start_vma() does not call it.

    No change in behavior intended.

    Signed-off-by: Peter Zijlstra (Intel)
    Acked-by: Will Deacon
    Cc: Andrew Morton
    Cc: Andy Lutomirski
    Cc: Aneesh Kumar K.V
    Cc: Borislav Petkov
    Cc: Dave Hansen
    Cc: David Miller
    Cc: Guan Xuetao
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Nick Piggin
    Cc: Peter Zijlstra
    Cc: Rik van Riel
    Cc: Thomas Gleixner
    Signed-off-by: Ingo Molnar

    Peter Zijlstra