24 Aug, 2018

3 commits

  • Merge fixes for missing TLB shootdowns.

    This fixes a couple of cases that involved us possibly freeing page
    table structures before the required TLB shootdown had been done.

    There are a few cleanup patches to make the code easier to follow, and
    to avoid some of the more problematic cases entirely when not necessary.

    To make this easier for backports, it undoes the recent lazy TLB
    patches, because the cleanups and fixes are more important, and Rik is
    ok with re-doing them later when things have calmed down.

    The missing TLB flush was only delayed, and the wrong ordering only
    happened under memory pressure (and in theory under a couple of other
    fairly theoretical situations), so this may have been all very unlikely
    to have hit people in practice.

    But getting the TLB shootdown wrong is _so_ hard to debug and see that I
    consider this a critical fix.

    Many thanks to Jann Horn for having debugged this.

    * tlb-fixes:
    x86/mm: Only use tlb_remove_table() for paravirt
    mm: mmu_notifier fix for tlb_end_vma
    mm/tlb, x86/mm: Support invalidating TLB caches for RCU_TABLE_FREE
    mm/tlb: Remove tlb_remove_table() non-concurrent condition
    mm: move tlb_table_flush to tlb_flush_mmu_free
    x86/mm/tlb: Revert the recent lazy TLB patches

    Linus Torvalds
     
  • Pull MIPS fixes from Paul Burton:

    - Fix microMIPS build failures by adding a .insn directive to the
    barrier_before_unreachable() asm statement in order to convince the
    toolchain that the asm statement is a valid branch target rather
    than a bogus attempt to switch ISA.

    - Clean up our declarations of TLB functions that we overwrite with
    generated code in order to prevent the compiler making assumptions
    about alignment that cause microMIPS kernels built with GCC 7 &
    above to die early during boot.

    - Fix up a regression for MIPS32 kernels which slipped into the main
    MIPS pull for 4.19, causing CONFIG_32BIT=y kernels to contain
    inappropriate MIPS64 instructions.

    - Extend our existing workaround for MIPSr6 builds that end up using
    the __multi3 intrinsic to GCC 7 & below, rather than just GCC 7.

    * tag 'mips_4.19_2' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
    MIPS: lib: Provide MIPS64r6 __multi3() for GCC < 7
    MIPS: Workaround GCC __builtin_unreachable reordering bug
    compiler.h: Allow arch-specific asm/compiler.h
    MIPS: Avoid move psuedo-instruction whilst using MIPS_ISA_LEVEL
    MIPS: Consistently declare TLB functions
    MIPS: Export tlbmiss_handler_setup_pgd near its definition

    Linus Torvalds
     
  • Jann reported that x86 was missing required TLB invalidates when he
    hit the !*batch slow path in tlb_remove_table().

    This is indeed the case; RCU_TABLE_FREE does not provide TLB (cache)
    invalidates, because the PowerPC-hash code where this originated and the
    Sparc-hash code where it was subsequently used did not need them. ARM,
    which later used this, put an explicit TLB invalidate in its
    __p*_free_tlb() functions, and PowerPC-radix followed that example.

    But when we hooked up x86 we failed to consider this. Fix this by
    (optionally) hooking tlb_remove_table() into the TLB invalidate code.
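
    A toy model of the ordering problem, not the kernel code: the class and
    method names below are hypothetical, and the TLB is reduced to a set of
    page-table names that a CPU may still have cached. It only illustrates
    why the fallback path, taken when no batch is available, has to
    invalidate before it frees.

```python
class ToyGather:
    """Toy stand-in for page-table free batching (hypothetical names)."""

    def __init__(self, invalidate_on_free, batch_available=True):
        self.invalidate_on_free = invalidate_on_free
        self.batch_available = batch_available
        self.tlb_cached = set()    # tables the hardware may still walk
        self.batch = []            # tables queued for deferred freeing
        self.freed = []
        self.use_after_free = []   # tables freed while still cached

    def map_table(self, table):
        self.tlb_cached.add(table)

    def _invalidate(self):
        self.tlb_cached.clear()

    def _free(self, table):
        if table in self.tlb_cached:
            self.use_after_free.append(table)
        self.freed.append(table)

    def remove_table(self, table):
        if not self.batch_available:
            # Slow path: no batch could be allocated, free one table now.
            if self.invalidate_on_free:
                self._invalidate()
            self._free(table)
            return
        self.batch.append(table)

    def flush(self):
        # Batched path: invalidate (if enabled), then free everything queued.
        if self.invalidate_on_free:
            self._invalidate()
        for table in self.batch:
            self._free(table)
        self.batch.clear()


def run(invalidate_on_free):
    gather = ToyGather(invalidate_on_free, batch_available=False)
    gather.map_table("pmd0")
    gather.remove_table("pmd0")
    return gather.use_after_free


print(run(invalidate_on_free=False))
print(run(invalidate_on_free=True))
```

    In this model the first call reports "pmd0" as freed while still cached,
    and the second reports nothing.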

    NOTE: s390 also needed something like this and might now be able to
    use the generic code again.

    [ Modified to be on top of Nick's cleanups, which simplified this patch
    now that tlb_flush_mmu_tlbonly() really only flushes the TLB - Linus ]

    Fixes: 9e52fc2b50de ("x86/mm: Enable RCU based page table freeing (CONFIG_HAVE_RCU_TABLE_FREE=y)")
    Reported-by: Jann Horn
    Signed-off-by: Peter Zijlstra (Intel)
    Acked-by: Rik van Riel
    Cc: Nicholas Piggin
    Cc: David Miller
    Cc: Will Deacon
    Cc: Martin Schwidefsky
    Cc: Michael Ellerman
    Cc: stable@kernel.org
    Signed-off-by: Linus Torvalds

    Peter Zijlstra
     

23 Aug, 2018

1 commit

  • Patch series "add support for relative references in special sections", v10.

    This adds support for emitting special sections such as initcall arrays,
    PCI fixups and tracepoints as relative references rather than absolute
    references. This reduces the size by 50% on 64-bit architectures, but
    more importantly, it removes the need for carrying relocation metadata for
    these sections in relocatable kernels (e.g., for KASLR) that needs to be
    fixed up at boot time. On arm64, this reduces the vmlinux footprint of
    such a reference by 8x (8 byte absolute reference + 24 byte RELA entry vs
    4 byte relative reference).
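
    As a rough illustration of the idea (a sketch, not the kernel macros):
    a relative reference stores the signed 32-bit distance from the slot
    holding it to the target, so it can be resolved without a relocation
    entry. The two addresses below are made up for the example.

```python
import struct

ABS_REF_BYTES = 8       # 64-bit absolute pointer
RELA_ENTRY_BYTES = 24   # one Elf64_Rela entry: r_offset, r_info, r_addend
REL_REF_BYTES = 4       # 32-bit place-relative offset


def encode_relative(place, target):
    """Return the 4-byte little-endian signed offset from place to target."""
    return struct.pack("<i", target - place)


def decode_relative(place, raw):
    """Recover the target address from the slot address and stored offset."""
    (offset,) = struct.unpack("<i", raw)
    return place + offset


place = 0xFFFF000010A00010    # hypothetical address of the table slot
target = 0xFFFF000010081234   # hypothetical address of the referenced code

raw = encode_relative(place, target)
assert decode_relative(place, raw) == target

ratio = (ABS_REF_BYTES + RELA_ENTRY_BYTES) // REL_REF_BYTES
print(len(raw), ratio)
```

    The decode step needs only the slot's own address, which is why the
    result stays valid wherever the image is loaded.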

    Patch #3 was sent out before as a single patch. This series supersedes
    the previous submission. This version makes relative ksymtab entries
    dependent on the new Kconfig symbol HAVE_ARCH_PREL32_RELOCATIONS rather
    than trying to infer from kbuild test robot replies for which
    architectures it should be blacklisted.

    Patch #1 introduces the new Kconfig symbol HAVE_ARCH_PREL32_RELOCATIONS,
    and sets it for the main architectures that are expected to benefit the
    most from this feature, i.e., 64-bit architectures or ones that use
    runtime relocations.

    Patch #2 adds support for #define'ing __DISABLE_EXPORTS to get rid of
    ksymtab/kcrctab sections in decompressor and EFI stub objects when
    rebuilding existing C files to run in a different context.

    Patches #4 - #6 implement relative references for initcalls, PCI fixups
    and tracepoints, respectively, all of which produce sections with order
    ~1000 entries on an arm64 defconfig kernel with tracing enabled. This
    means we save about 28 KB of vmlinux space for each of these patches.
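
    The 28 KB figure follows from the per-entry sizes quoted earlier in
    this message; a quick check, taking "order ~1000 entries" as exactly
    1000 for the sake of the arithmetic:

```python
ENTRIES = 1000               # assumed round figure for "~1000 entries"
ABSOLUTE_BYTES = 8 + 24      # absolute reference plus its RELA entry
RELATIVE_BYTES = 4           # relative reference, no relocation entry


def saved_bytes(entries):
    """Bytes saved in vmlinux when every entry becomes a relative reference."""
    return entries * (ABSOLUTE_BYTES - RELATIVE_BYTES)


print(saved_bytes(ENTRIES))
```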

    [From the v7 series blurb, which included the jump_label patches as well]:

    For the arm64 kernel, all patches combined reduce the memory footprint
    of vmlinux by about 1.3 MB (using a config copied from Ubuntu that has
    KASLR enabled), of which ~1 MB is the size reduction of the RELA section
    in .init, and the remaining 300 KB is reduction of .text/.data.

    This patch (of 6):

    Before updating certain subsystems to use place-relative 32-bit
    relocations in special sections, to save space and reduce the number of
    absolute relocations that need to be processed at runtime by relocatable
    kernels, introduce the Kconfig symbol and define it for some architectures
    that should be able to support and benefit from it.

    Link: http://lkml.kernel.org/r/20180704083651.24360-2-ard.biesheuvel@linaro.org
    Signed-off-by: Ard Biesheuvel
    Acked-by: Michael Ellerman
    Reviewed-by: Will Deacon
    Acked-by: Ingo Molnar
    Cc: Arnd Bergmann
    Cc: Kees Cook
    Cc: Thomas Garnier
    Cc: Thomas Gleixner
    Cc: "Serge E. Hallyn"
    Cc: Bjorn Helgaas
    Cc: Benjamin Herrenschmidt
    Cc: Russell King
    Cc: Paul Mackerras
    Cc: Catalin Marinas
    Cc: Petr Mladek
    Cc: James Morris
    Cc: Nicolas Pitre
    Cc: Josh Poimboeuf
    Cc: Steven Rostedt
    Cc: Sergey Senozhatsky
    Cc: James Morris
    Cc: Jessica Yu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ard Biesheuvel
     

22 Aug, 2018

1 commit

  • We have a need to override the definition of
    barrier_before_unreachable() for MIPS, which means we either need to add
    architecture-specific code into linux/compiler-gcc.h or we need to allow
    the architecture to provide a header that can define the macro before
    the generic definition. The latter seems like the better approach.

    A straightforward approach to the per-arch header is to make use of
    asm-generic to provide a default empty header & adjust architectures
    which don't need anything specific to make use of that by adding the
    header to generic-y. Unfortunately this doesn't work so well due to
    commit 28128c61e08e ("kconfig.h: Include compiler types to avoid missed
    struct attributes") which caused linux/compiler_types.h to be included
    in the compilation of every C file via the -include linux/kconfig.h flag
    in c_flags.

    Because the -include flag is present for all C files we compile, we need
    the architecture-provided header to be present before any C files are
    compiled. If any C files can be compiled prior to the asm-generic header
    wrappers being generated then we hit a build failure due to missing
    header. Such cases do exist - one pointed out by the kbuild test robot
    is the compilation of arch/ia64/kernel/nr-irqs.c, which occurs as part
    of the archprepare target [1].

    This leaves us with a few options:

    1) Use generic-y & fix any build failures we find by enforcing
    ordering such that the asm-generic target occurs before any C
    compilation, such that linux/compiler_types.h can always include
    the generated asm-generic wrapper which in turn includes the empty
    asm-generic header. This would rely on us finding all the
    problematic cases - I don't know for sure that the ia64 issue is
    the only one.

    2) Add an actual empty header to each architecture, so that we don't
    need the generated asm-generic wrapper. This seems messy.

    3) Give up & add #ifdef CONFIG_MIPS or similar to
    linux/compiler_types.h. This seems messy too.

    4) Include the arch header only when it's actually needed, removing
    the need for the asm-generic wrapper for all other architectures.

    This patch allows us to use approach 4, by including an asm/compiler.h
    header from linux/compiler_types.h after the inclusion of the
    compiler-specific linux/compiler-*.h header(s). We do this
    conditionally, only when CONFIG_HAVE_ARCH_COMPILER_H is selected, in
    order to avoid the need for asm-generic wrappers & the associated build
    ordering issue described above. The asm/compiler.h header is included
    after the generic linux/compiler-*.h header(s) for consistency with the
    way linux/compiler-intel.h & linux/compiler-clang.h are included after
    the linux/compiler-gcc.h header that they override.

    [1] https://lists.01.org/pipermail/kbuild-all/2018-August/051175.html

    Signed-off-by: Paul Burton
    Reviewed-by: Masahiro Yamada
    Patchwork: https://patchwork.linux-mips.org/patch/20269/
    Cc: Arnd Bergmann
    Cc: James Hogan
    Cc: Masahiro Yamada
    Cc: Ralf Baechle
    Cc: linux-arch@vger.kernel.org
    Cc: linux-kbuild@vger.kernel.org
    Cc: linux-mips@linux-mips.org

    Paul Burton
     

16 Aug, 2018

2 commits

  • Pull Kconfig consolidation from Masahiro Yamada:
    "Consolidation of Kconfig files by Christoph Hellwig.

    Move the source statements of arch-independent Kconfig files instead
    of duplicating the includes in every arch/$(SRCARCH)/Kconfig"

    * tag 'kconfig-v4.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
    kconfig: add a Memory Management options" menu
    kconfig: move the "Executable file formats" menu to fs/Kconfig.binfmt
    kconfig: use a menu in arch/Kconfig to reduce clutter
    kconfig: include kernel/Kconfig.preempt from init/Kconfig
    Kconfig: consolidate the "Kernel hacking" menu
    kconfig: include common Kconfig files from top-level Kconfig
    kconfig: remove duplicate SWAP symbol defintions
    um: create a proper drivers Kconfig
    um: cleanup Kconfig files
    um: stop abusing KBUILD_KCONFIG

    Linus Torvalds
     
  • Pull gcc plugin cleanups from Kees Cook:

    - Kconfig and Makefile clean-ups (Masahiro Yamada, Kees Cook)

    - gcc-common.h definition clean-ups (Alexander Popov)

    * tag 'gcc-plugin-cleanup-v4.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
    gcc-plugins: Clean up the cgraph_create_edge* macros
    gcc-plugins: Regularize Makefile.gcc-plugins
    gcc-plugins: split out Kconfig entries to scripts/gcc-plugins/Kconfig
    gcc-plugins: remove unused GCC_PLUGIN_SUBDIR

    Linus Torvalds
     

02 Aug, 2018

3 commits


25 Jul, 2018

1 commit


21 Jun, 2018

1 commit

  • Provide a command line and a sysfs knob to control SMT.

    The command line options are:

    'nosmt': Enumerate secondary threads, but do not online them

    'nosmt=force': Ignore secondary threads completely during enumeration
    via MP table and ACPI/MADT.

    The sysfs control file has the following states (read/write):

    'on': SMT is enabled. Secondary threads can be freely onlined
    'off': SMT is disabled. Secondary threads, even if enumerated,
    cannot be onlined
    'forceoff': SMT is permanently disabled. Writes to the control
    file are rejected.
    'notsupported': SMT is not supported by the CPU

    The command line option 'nosmt' sets the sysfs control to 'off'. This
    can be changed to 'on' to reenable SMT during runtime.

    The command line option 'nosmt=force' sets the sysfs control to
    'forceoff'. This cannot be changed during runtime.

    When SMT is 'on' and the control file is changed to 'off' then all online
    secondary threads are offlined and attempts to online a secondary thread
    later on are rejected.

    When SMT is 'off' and the control file is changed to 'on' then secondary
    threads can be onlined again. The 'off' -> 'on' transition does not
    automatically online the secondary threads.

    When the control file is set to 'forceoff', the behaviour is the same as
    setting it to 'off', but the operation is irreversible and later writes to
    the control file are rejected.

    When the control status is 'notsupported' then writes to the control file
    are rejected.
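
    The transitions described above can be summarised as a small state
    machine. This is a toy model only: the class name, the exception types
    and the assumption that secondary threads start online when SMT is
    'on' are choices made for the sketch, not taken from the patch.

```python
class SmtControl:
    """Toy model of the SMT control file; not the kernel implementation."""

    def __init__(self, cmdline=None, smt_supported=True, secondary_cpus=(1, 3)):
        if not smt_supported:
            self.state = "notsupported"
        elif cmdline == "nosmt=force":
            self.state = "forceoff"
        elif cmdline == "nosmt":
            self.state = "off"
        else:
            self.state = "on"
        # cpu -> online? Assumed: secondaries start online only when 'on'.
        self.secondary = {cpu: self.state == "on" for cpu in secondary_cpus}

    def write(self, value):
        """Model a write to the control file."""
        if value not in ("on", "off", "forceoff"):
            raise ValueError("unsupported value: " + value)
        if self.state in ("forceoff", "notsupported"):
            # The text only says the write is rejected; the exception
            # type is a stand-in for this sketch.
            raise PermissionError("writes rejected in state " + self.state)
        if value in ("off", "forceoff"):
            for cpu in self.secondary:
                self.secondary[cpu] = False   # offline all secondary threads
        self.state = value                    # 'off' -> 'on' onlines nothing

    def online(self, cpu):
        """Model an attempt to online one secondary thread."""
        if self.state != "on":
            raise PermissionError("cannot online a secondary thread")
        self.secondary[cpu] = True


ctl = SmtControl(cmdline="nosmt")
ctl.write("on")                    # re-enable SMT at runtime
print(ctl.state, ctl.secondary)    # threads stay offline until onlined
ctl.online(1)
ctl.write("forceoff")              # irreversible in this model as well
print(ctl.state, ctl.secondary)
```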

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Konrad Rzeszutek Wilk
    Acked-by: Ingo Molnar

    Thomas Gleixner
     

16 Jun, 2018

1 commit

  • As we move stuff around, some doc references are broken. Fix some of
    them via this script:
    ./scripts/documentation-file-ref-check --fix

    Manually checked if the produced result is valid, removing a few
    false-positives.

    Acked-by: Takashi Iwai
    Acked-by: Masami Hiramatsu
    Acked-by: Stephen Boyd
    Acked-by: Charles Keepax
    Acked-by: Mathieu Poirier
    Reviewed-by: Coly Li
    Signed-off-by: Mauro Carvalho Chehab
    Acked-by: Jonathan Corbet

    Mauro Carvalho Chehab
     

15 Jun, 2018

1 commit

  • HAVE_CC_STACKPROTECTOR should be selected by architectures with stack
    canary implementation. It is not about the compiler support.

    For consistency with commit 050e9baa9dc9 ("Kbuild: rename
    CC_STACKPROTECTOR[_STRONG] config variables"), remove 'CC_' from the
    config symbol.

    I moved the 'select' lines to keep the alphabetical sorting.

    Signed-off-by: Masahiro Yamada
    Acked-by: Kees Cook
    Signed-off-by: Linus Torvalds

    Masahiro Yamada
     

14 Jun, 2018

1 commit

  • The changes to automatically test for working stack protector compiler
    support in the Kconfig files removed the special STACKPROTECTOR_AUTO
    option that picked the strongest stack protector that the compiler
    supported.

    That was all a nice cleanup - it makes no sense to have the AUTO case
    now that the Kconfig phase can just determine the compiler support
    directly.

    HOWEVER.

    It also meant that doing "make oldconfig" would now _disable_ the strong
    stackprotector if you had AUTO enabled, because in a legacy config file,
    the sane stack protector configuration would look like

    CONFIG_HAVE_CC_STACKPROTECTOR=y
    # CONFIG_CC_STACKPROTECTOR_NONE is not set
    # CONFIG_CC_STACKPROTECTOR_REGULAR is not set
    # CONFIG_CC_STACKPROTECTOR_STRONG is not set
    CONFIG_CC_STACKPROTECTOR_AUTO=y

    and when you ran this through "make oldconfig" with the Kbuild changes,
    it would ask you about the regular CONFIG_CC_STACKPROTECTOR (that had
    been renamed from CONFIG_CC_STACKPROTECTOR_REGULAR to just
    CONFIG_CC_STACKPROTECTOR), but it would think that the STRONG version
    used to be disabled (because it was really enabled by AUTO), and would
    disable it in the new config, resulting in:

    CONFIG_HAVE_CC_STACKPROTECTOR=y
    CONFIG_CC_HAS_STACKPROTECTOR_NONE=y
    CONFIG_CC_STACKPROTECTOR=y
    # CONFIG_CC_STACKPROTECTOR_STRONG is not set
    CONFIG_CC_HAS_SANE_STACKPROTECTOR=y

    That's dangerously subtle - people could suddenly find themselves with
    the weaker stack protector setup without even realizing.

    The solution here is to rename not just the old REGULAR stack
    protector option, but also the strong one. This does that by just
    removing the CC_ prefix entirely for the user choices, because it really
    is not about the compiler support (the compiler support now instead
    automatically impacts _visibility_ of the options to users).

    This results in "make oldconfig" actually asking the user for their
    choice, so that we don't have any silent subtle security model changes.
    The end result would generally look like this:

    CONFIG_HAVE_CC_STACKPROTECTOR=y
    CONFIG_CC_HAS_STACKPROTECTOR_NONE=y
    CONFIG_STACKPROTECTOR=y
    CONFIG_STACKPROTECTOR_STRONG=y
    CONFIG_CC_HAS_SANE_STACKPROTECTOR=y

    where the "CC_" versions really are about internal compiler
    infrastructure, not the user selections.
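
    A simplified model of why the rename helps, assuming only that
    "make oldconfig" reuses the recorded answer for any symbol name it
    finds in the old file and asks about names it has never seen. The
    helper names are hypothetical, and real Kconfig also evaluates
    dependencies and visibility, which this sketch ignores.

```python
def parse_config(text):
    """Map symbol name -> value ('n' for 'is not set') from .config lines."""
    values = {}
    for line in text.strip().splitlines():
        line = line.strip()
        if line.startswith("# CONFIG_") and line.endswith(" is not set"):
            values[line[len("# CONFIG_"):-len(" is not set")]] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line[len("CONFIG_"):].split("=", 1)
            values[name] = value
    return values


def oldconfig(old, prompt_symbols):
    """Split symbols into silently reused answers and new questions."""
    reused = {s: old[s] for s in prompt_symbols if s in old}
    asked = [s for s in prompt_symbols if s not in old]
    return reused, asked


LEGACY = """
CONFIG_HAVE_CC_STACKPROTECTOR=y
# CONFIG_CC_STACKPROTECTOR_NONE is not set
# CONFIG_CC_STACKPROTECTOR_REGULAR is not set
# CONFIG_CC_STACKPROTECTOR_STRONG is not set
CONFIG_CC_STACKPROTECTOR_AUTO=y
"""

old = parse_config(LEGACY)
# Before this patch: only the regular option had been renamed.
before = oldconfig(old, ["CC_STACKPROTECTOR", "CC_STACKPROTECTOR_STRONG"])
# After this patch: both user-visible options have new names.
after = oldconfig(old, ["STACKPROTECTOR", "STACKPROTECTOR_STRONG"])
print(before)
print(after)
```

    In the first case the stale "is not set" answer for the strong option
    is reused without a prompt; in the second, both options are asked.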

    Acked-by: Masahiro Yamada
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

13 Jun, 2018

1 commit

  • Pull more Kbuild updates from Masahiro Yamada:

    - fix some bugs introduced by the recent Kconfig syntax extension

    - add some symbols about compiler information in Kconfig, such as
    CC_IS_GCC, CC_IS_CLANG, GCC_VERSION, etc.

    - test compiler capability for the stack protector in Kconfig, and
    clean-up Makefile

    - test compiler capability for GCC-plugins in Kconfig, and clean-up
    Makefile

    - allow to enable GCC-plugins for COMPILE_TEST

    - test compiler capability for KCOV in Kconfig and correct dependency

    - remove auto-detect mode of the GCOV format, which is now more nicely
    handled in Kconfig

    - test compiler capability for mprofile-kernel on PowerPC, and clean-up
    Makefile

    - misc cleanups

    * tag 'kbuild-v4.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
    linux/linkage.h: replace VMLINUX_SYMBOL_STR() with __stringify()
    kconfig: fix localmodconfig
    sh: remove no-op macro VMLINUX_SYMBOL()
    powerpc/kbuild: move -mprofile-kernel check to Kconfig
    Documentation: kconfig: add recommended way to describe compiler support
    gcc-plugins: disable GCC_PLUGIN_STRUCTLEAK_BYREF_ALL for COMPILE_TEST
    gcc-plugins: allow to enable GCC_PLUGINS for COMPILE_TEST
    gcc-plugins: test plugin support in Kconfig and clean up Makefile
    gcc-plugins: move GCC version check for PowerPC to Kconfig
    kcov: test compiler capability in Kconfig and correct dependency
    gcov: remove CONFIG_GCOV_FORMAT_AUTODETECT
    arm64: move GCC version check for ARCH_SUPPORTS_INT128 to Kconfig
    kconfig: add CC_IS_CLANG and CLANG_VERSION
    kconfig: add CC_IS_GCC and GCC_VERSION
    stack-protector: test compiler capability in Kconfig and drop AUTO mode
    kbuild: fix endless syncconfig in case arch Makefile sets CROSS_COMPILE

    Linus Torvalds
     

11 Jun, 2018

4 commits

  • We have enabled GCC_PLUGINS for COMPILE_TEST, but allmodconfig now
    produces new warnings.

    CC [M] drivers/net/wireless/broadcom/brcm80211/brcmsmac/phy/phy_n.o
    drivers/net/wireless/broadcom/brcm80211/brcmsmac/phy/phy_n.c: In function ‘wlc_phy_workarounds_nphy_rev7’:
    drivers/net/wireless/broadcom/brcm80211/brcmsmac/phy/phy_n.c:16563:1: warning: the frame size of 3128 bytes is larger than 2048 bytes [-Wframe-larger-than=]
    }
    ^
    drivers/net/wireless/broadcom/brcm80211/brcmsmac/phy/phy_n.c: In function ‘wlc_phy_workarounds_nphy_rev3’:
    drivers/net/wireless/broadcom/brcm80211/brcmsmac/phy/phy_n.c:16905:1: warning: the frame size of 2800 bytes is larger than 2048 bytes [-Wframe-larger-than=]
    }
    ^
    drivers/net/wireless/broadcom/brcm80211/brcmsmac/phy/phy_n.c: In function ‘wlc_phy_cal_txiqlo_nphy’:
    drivers/net/wireless/broadcom/brcm80211/brcmsmac/phy/phy_n.c:26033:1: warning: the frame size of 2488 bytes is larger than 2048 bytes [-Wframe-larger-than=]
    }
    ^

    It looks like GCC_PLUGIN_STRUCTLEAK_BYREF_ALL is causing this.
    Add "depends on !COMPILE_TEST" to not disturb the compile test.

    Reported-by: Stephen Rothwell
    Suggested-by: Kees Cook
    Signed-off-by: Masahiro Yamada

    Masahiro Yamada
     
  • Now that the compiler's plugin support is checked in Kconfig,
    all{yes,mod}config will not be bothered.

    Remove 'depends on !COMPILE_TEST' for GCC_PLUGINS.

    'depends on !COMPILE_TEST' for the following three are still kept:
    GCC_PLUGIN_CYC_COMPLEXITY
    GCC_PLUGIN_STRUCTLEAK_VERBOSE
    GCC_PLUGIN_RANDSTRUCT_PERFORMANCE

    Kees suggested to do so because the first two are too noisy, and the
    last one would reduce the compile test coverage. I commented the
    reasons in arch/Kconfig.

    Signed-off-by: Masahiro Yamada
    Acked-by: Kees Cook

    Masahiro Yamada
     
  • Run scripts/gcc-plugin.sh from Kconfig so that users can enable
    GCC_PLUGINS only when the compiler supports building plugins.

    Kconfig defines a new symbol, PLUGIN_HOSTCC. This will contain
    the compiler (g++ or gcc) used for building plugins, or empty
    if the plugin can not be supported at all.

    This allows us to remove all ugly testing in Makefile.gcc-plugins.

    Signed-off-by: Masahiro Yamada
    Acked-by: Kees Cook

    Masahiro Yamada
     
  • Pull restartable sequence support from Thomas Gleixner:
    "The restartable sequences syscall (finally):

    After a lot of back and forth discussion and massive delays caused by
    the speculative distraction of maintainers, the core set of
    restartable sequences has finally reached a consensus.

    It comes with the basic non disputed core implementation along with
    support for arm, powerpc and x86 and a full set of selftests

    It was exposed to linux-next earlier this week, so it does not fully
    comply with the merge window requirements, but there is really no
    point to drag it out for yet another cycle"

    * 'core-rseq-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    rseq/selftests: Provide Makefile, scripts, gitignore
    rseq/selftests: Provide parametrized tests
    rseq/selftests: Provide basic percpu ops test
    rseq/selftests: Provide basic test
    rseq/selftests: Provide rseq library
    selftests/lib.mk: Introduce OVERRIDE_TARGETS
    powerpc: Wire up restartable sequences system call
    powerpc: Add syscall detection for restartable sequences
    powerpc: Add support for restartable sequences
    x86: Wire up restartable sequence system call
    x86: Add support for restartable sequences
    arm: Wire up restartable sequences system call
    arm: Add syscall detection for restartable sequences
    arm: Add restartable sequences support
    rseq: Introduce restartable sequences system call
    uapi/headers: Provide types_32_64.h

    Linus Torvalds
     

08 Jun, 2018

1 commit

  • Move the test for -fstack-protector(-strong) option to Kconfig.

    If the compiler does not support the option, the corresponding menu
    is automatically hidden. If STRONG is not supported, it will fall
    back to REGULAR. If REGULAR is not supported, it will be disabled.
    This means, AUTO is implicitly handled by the dependency solver of
    Kconfig, hence removed.
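
    The fallback can be pictured as two dependent booleans. In this sketch
    the function name is hypothetical, and making STRONG depend on REGULAR
    is an assumption about how the two symbols relate.

```python
def resolve_stackprotector(cc_has_regular, cc_has_strong,
                           want_regular=True, want_strong=True):
    """Return the effective protector level for a compiler and user choice."""
    # An option is only selectable when the compiler supports it.
    regular = want_regular and cc_has_regular
    # Assumed: STRONG depends on REGULAR being enabled.
    strong = regular and want_strong and cc_has_strong
    if strong:
        return "strong"
    if regular:
        return "regular"
    return "none"


for caps in [(True, True), (True, False), (False, False)]:
    print(caps, resolve_stackprotector(*caps))
```

    No separate AUTO value is needed: the strongest level the compiler
    supports falls out of the dependencies.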

    I also turned the 'choice' into only two boolean symbols. The use of
    'choice' is not a good idea here, because all of all{yes,mod,no}config
    would choose the first visible value, while we want allnoconfig to
    disable as many features as possible.

    X86 has additional shell scripts in case the compiler supports those
    options, but generates broken code. I added CC_HAS_SANE_STACKPROTECTOR
    to test this. I had to add -m32 to gcc-x86_32-has-stack-protector.sh
    to make it work correctly.

    Signed-off-by: Masahiro Yamada
    Acked-by: Kees Cook

    Masahiro Yamada
     

07 Jun, 2018

1 commit

  • Pull Kbuild updates from Masahiro Yamada:

    - improve fixdep to coalesce consecutive slashes in dep-files

    - fix some issues of the maintainer string generation in deb-pkg script

    - remove unused CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX and clean-up
    several tools and linker scripts

    - clean-up modpost

    - allow to enable the dead code/data elimination for PowerPC in EXPERT
    mode

    - improve two coccinelle scripts for better performance

    - pass endianness and machine size flags to sparse for all architecture

    - misc fixes

    * tag 'kbuild-v4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (25 commits)
    kbuild: add machine size to CHECKFLAGS
    kbuild: add endianness flag to CHEKCFLAGS
    kbuild: $(CHECK) doesnt need NOSTDINC_FLAGS twice
    scripts: Fixed printf format mismatch
    scripts/tags.sh: use `find` for $ALLSOURCE_ARCHS generation
    coccinelle: deref_null: improve performance
    coccinelle: mini_lock: improve performance
    powerpc: Allow LD_DEAD_CODE_DATA_ELIMINATION to be selected
    kbuild: Allow LD_DEAD_CODE_DATA_ELIMINATION to be selectable if enabled
    kbuild: LD_DEAD_CODE_DATA_ELIMINATION no -ffunction-sections/-fdata-sections for module build
    kbuild: Fix asm-generic/vmlinux.lds.h for LD_DEAD_CODE_DATA_ELIMINATION
    modpost: constify *modname function argument where possible
    modpost: remove redundant is_vmlinux() test
    modpost: use strstarts() helper more widely
    modpost: pass struct elf_info pointer to get_modinfo()
    checkpatch: remove VMLINUX_SYMBOL() check
    vmlinux.lds.h: remove no-op macro VMLINUX_SYMBOL()
    kbuild: remove CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX
    export.h: remove code for prefixing symbols with underscore
    depmod.sh: remove symbol prefix support
    ...

    Linus Torvalds
     

06 Jun, 2018

1 commit

  • Expose a new system call allowing each thread to register one userspace
    memory area to be used as an ABI between kernel and user-space for two
    purposes: user-space restartable sequences and quick access to read the
    current CPU number value from user-space.

    * Restartable sequences (per-cpu atomics)

    Restartable sequences allow user-space to perform update operations on
    per-cpu data without requiring heavy-weight atomic operations.

    The restartable critical sections (percpu atomics) work was started
    by Paul Turner and Andrew Hunter. It lets the kernel handle restart of
    critical sections. [1] [2] The re-implementation proposed here brings a
    few simplifications to the ABI which facilitate porting to other
    architectures and speed up the user-space fast path.
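
    To illustrate the control flow only: the sketch below simulates a
    per-CPU counter increment that is prepared, then committed with a
    single store, and restarted if the thread was interrupted in between.
    Everything here is a toy. The names are hypothetical, and in the real
    mechanism it is the kernel that redirects an interrupted critical
    section; no Python loop is involved.

```python
class ToyThread:
    """Thread whose 'kernel' may migrate it during a critical section."""

    def __init__(self, cpu, preempt_plan):
        self.cpu = cpu
        # One entry per attempt: CPU to migrate to, or None for no preemption.
        self.preempt_plan = list(preempt_plan)

    def maybe_preempt(self):
        """Return True if this attempt was interrupted before its commit."""
        if not self.preempt_plan:
            return False
        new_cpu = self.preempt_plan.pop(0)
        if new_cpu is None:
            return False
        self.cpu = new_cpu
        return True


def percpu_increment(counters, thread):
    """Increment the counter of the thread's current CPU; return attempts."""
    attempts = 0
    while True:
        attempts += 1
        cpu = thread.cpu                # read the cached CPU number
        new_value = counters[cpu] + 1   # prepare; nothing shared is modified
        if thread.maybe_preempt():      # interrupted before the commit
            continue                    # restart from the top
        counters[cpu] = new_value       # single-store commit
        return attempts


counters = [0, 0]
thread = ToyThread(cpu=0, preempt_plan=[1, None])
attempts = percpu_increment(counters, thread)
print(attempts, counters)
```

    The first attempt is abandoned after the simulated migration to CPU 1,
    so only CPU 1's counter is updated and no atomic operation is needed.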

    Here are benchmarks of various rseq use-cases.

    Test hardware:

    arm32: ARMv7 Processor rev 4 (v7l) "Cubietruck", 2-core
    x86-64: Intel E5-2630 v3@2.40GHz, 16-core, hyperthreading

    The following benchmarks were all performed on a single thread.

    * Per-CPU statistic counter increment

                getcpu+atomic (ns/op)   rseq (ns/op)   speedup
    arm32:                      344.0           31.4      11.0
    x86-64:                      15.3            2.0       7.7

    * LTTng-UST: write event 32-bit header, 32-bit payload into tracer
    per-cpu buffer

                getcpu+atomic (ns/op)   rseq (ns/op)   speedup
    arm32:                     2502.0         2250.0       1.1
    x86-64:                     117.4           98.0       1.2

    * liburcu percpu: lock-unlock pair, dereference, read/compare word

                getcpu+atomic (ns/op)   rseq (ns/op)   speedup
    arm32:                      751.0          128.5       5.8
    x86-64:                      53.4           28.6       1.9
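
    The speedup column in the three tables above is the ratio of the two
    timings. A small check, with the figures copied from the tables and
    benchmark labels shortened for the example:

```python
# (benchmark, arch) -> (getcpu+atomic ns/op, rseq ns/op, reported speedup)
ROWS = {
    ("per-CPU counter increment", "arm32"): (344.0, 31.4, 11.0),
    ("per-CPU counter increment", "x86-64"): (15.3, 2.0, 7.7),
    ("LTTng-UST event write", "arm32"): (2502.0, 2250.0, 1.1),
    ("LTTng-UST event write", "x86-64"): (117.4, 98.0, 1.2),
    ("liburcu lock-unlock pair", "arm32"): (751.0, 128.5, 5.8),
    ("liburcu lock-unlock pair", "x86-64"): (53.4, 28.6, 1.9),
}


def speedup(baseline_ns, rseq_ns):
    """Ratio of the getcpu+atomic time to the rseq time."""
    return baseline_ns / rseq_ns


for (bench, arch), (base, rseq, reported) in ROWS.items():
    print(f"{bench:26s} {arch:7s} {speedup(base, rseq):5.2f} (reported {reported})")
```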

    * jemalloc memory allocator adapted to use rseq

    Using rseq with per-cpu memory pools in jemalloc at Facebook (based on
    rseq 2016 implementation):

    The production workload shows a 1-2% gain in average response latency,
    and the P99 overall latency drops by 2-3%.

    * Reading the current CPU number

    Speeding up reading the current CPU number on which the caller thread is
    running is done by keeping the current CPU number up to date within the
    cpu_id field of the memory area registered by the thread. This is done
    by making scheduler preemption set the TIF_NOTIFY_RESUME flag on the
    current thread. Upon return to user-space, a notify-resume handler
    updates the current CPU value within the registered user-space memory
    area. User-space can then read the current CPU number directly from
    memory.

    Keeping the current cpu id in a memory area shared between kernel and
    user-space is an improvement over current mechanisms available to read
    the current CPU number, which has the following benefits over
    alternative approaches:

    - 35x speedup on ARM vs system call through glibc
    - 20x speedup on x86 compared to calling glibc, which calls vdso
    executing a "lsl" instruction,
    - 14x speedup on x86 compared to inlined "lsl" instruction,
    - Unlike vdso approaches, this cpu_id value can be read from an inline
    assembly, which makes it a useful building block for restartable
    sequences.
    - The approach of reading the cpu id through memory mapping shared
    between kernel and user-space is portable (e.g. ARM), which is not the
    case for the lsl-based x86 vdso.

    On x86, yet another possible approach would be to use the gs segment
    selector to point to user-space per-cpu data. This approach performs
    similarly to the cpu id cache, but it has two disadvantages: it is
    not portable, and it is incompatible with existing applications already
    using the gs segment selector for other purposes.

    Benchmarking various approaches for reading the current CPU number:

    ARMv7 Processor rev 4 (v7l)
    Machine model: Cubietruck
    - Baseline (empty loop): 8.4 ns
    - Read CPU from rseq cpu_id: 16.7 ns
    - Read CPU from rseq cpu_id (lazy register): 19.8 ns
    - glibc 2.19-0ubuntu6.6 getcpu: 301.8 ns
    - getcpu system call: 234.9 ns

    x86-64 Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz:
    - Baseline (empty loop): 0.8 ns
    - Read CPU from rseq cpu_id: 0.8 ns
    - Read CPU from rseq cpu_id (lazy register): 0.8 ns
    - Read using gs segment selector: 0.8 ns
    - "lsl" inline assembly: 13.0 ns
    - glibc 2.19-0ubuntu6 getcpu: 16.6 ns
    - getcpu system call: 53.9 ns

    - Speed (benchmark taken on v8 of patchset)

    Running 10 runs of hackbench -l 100000 seems to indicate, contrary to
    expectations, that enabling CONFIG_RSEQ slightly accelerates the
    scheduler:

    Configuration: 2 sockets * 8-core Intel(R) Xeon(R) CPU E5-2630 v3 @
    2.40GHz (directly on hardware, hyperthreading disabled in BIOS, energy
    saving disabled in BIOS, turboboost disabled in BIOS, cpuidle.off=1
    kernel parameter), with a Linux v4.6 defconfig+localyesconfig,
    restartable sequences series applied.

    * CONFIG_RSEQ=n

    avg.: 41.37 s
    std.dev.: 0.36 s

    * CONFIG_RSEQ=y

    avg.: 40.46 s
    std.dev.: 0.33 s

    - Size

    On x86-64, between CONFIG_RSEQ=n/y, the text size increase of vmlinux is
    567 bytes, and the data size increase of vmlinux is 5696 bytes.

    [1] https://lwn.net/Articles/650333/
    [2] http://www.linuxplumbersconf.org/2013/ocw/system/presentations/1695/original/LPC%20-%20PerCpu%20Atomics.pdf

    Signed-off-by: Mathieu Desnoyers
    Signed-off-by: Thomas Gleixner
    Acked-by: Peter Zijlstra (Intel)
    Cc: Joel Fernandes
    Cc: Catalin Marinas
    Cc: Dave Watson
    Cc: Will Deacon
    Cc: Andi Kleen
    Cc: "H . Peter Anvin"
    Cc: Chris Lameter
    Cc: Russell King
    Cc: Andrew Hunter
    Cc: Michael Kerrisk
    Cc: "Paul E . McKenney"
    Cc: Paul Turner
    Cc: Boqun Feng
    Cc: Josh Triplett
    Cc: Steven Rostedt
    Cc: Ben Maurer
    Cc: Alexander Viro
    Cc: linux-api@vger.kernel.org
    Cc: Andy Lutomirski
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/20151027235635.16059.11630.stgit@pjt-glaptop.roam.corp.google.com
    Link: http://lkml.kernel.org/r/20150624222609.6116.86035.stgit@kitami.mtv.corp.google.com
    Link: https://lkml.kernel.org/r/20180602124408.8430-3-mathieu.desnoyers@efficios.com

    Mathieu Desnoyers
     

05 Jun, 2018

2 commits

  • Pull timers and timekeeping updates from Thomas Gleixner:

    - Core infrastructure work for Y2038 to address the COMPAT interfaces:

    + Add a new Y2038 safe __kernel_timespec and use it in the core
    code

    + Introduce config switches which allow controlling the various
    compat mechanisms

    + Use the new config switch in the posix timer code to control the
    32bit compat syscall implementation.

    - Prevent bogus selection of CPU local clocksources which causes an
    endless reselection loop

    - Remove the extra kthread in the clocksource code which has no value
    and just adds another level of indirection

    - The usual bunch of trivial updates, cleanups and fixlets all over the
    place

    - More SPDX conversions

    * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
    clocksource/drivers/mxs_timer: Switch to SPDX identifier
    clocksource/drivers/timer-imx-tpm: Switch to SPDX identifier
    clocksource/drivers/timer-imx-gpt: Switch to SPDX identifier
    clocksource/drivers/timer-imx-gpt: Remove outdated file path
    clocksource/drivers/arc_timer: Add comments about locking while read GFRC
    clocksource/drivers/mips-gic-timer: Add pr_fmt and reword pr_* messages
    clocksource/drivers/sprd: Fix Kconfig dependency
    clocksource: Move inline keyword to the beginning of function declarations
    timer_list: Remove unused function pointer typedef
    timers: Adjust a kernel-doc comment
    tick: Prefer a lower rating device only if it's CPU local device
    clocksource: Remove kthread
    time: Change nanosleep to safe __kernel_* types
    time: Change types to new y2038 safe __kernel_* types
    time: Fix get_timespec64() for y2038 safe compat interfaces
    time: Add new y2038 safe __kernel_timespec
    posix-timers: Make compat syscalls depend on CONFIG_COMPAT_32BIT_TIME
    time: Introduce CONFIG_COMPAT_32BIT_TIME
    time: Introduce CONFIG_64BIT_TIME in architectures
    compat: Enable compat_get/put_timespec64 always
    ...

    Linus Torvalds
     
  • Pull dma-mapping updates from Christoph Hellwig:

    - replace the force_dma flag with a dma_configure bus method. (Nipun
    Gupta, although one patch is incorrectly attributed to me due to a
    git rebase bug)

    - use GFP_DMA32 more aggressively in dma-direct. (Takashi Iwai)

    - remove PCI_DMA_BUS_IS_PHYS and rely on the dma-mapping API to do the
    right thing for bounce buffering.

    - move dma-debug initialization to common code, and apply a few
    cleanups to the dma-debug code.

    - clean up the Kconfig mess around swiotlb selection

    - swiotlb comment fixup (Yisheng Xie)

    - a trivial swiotlb fix. (Dan Carpenter)

    - support swiotlb on RISC-V. (based on a patch from Palmer Dabbelt)

    - add a new generic dma-noncoherent dma_map_ops implementation and use
    it for arc, c6x and nds32.

    - improve scatterlist validity checking in dma-debug. (Robin Murphy)

    - add a struct device quirk to limit the dma-mask to 32-bit due to
    bridge/system issues, and switch x86 to use it instead of a local
    hack for VIA bridges.

    - handle devices without a dma_mask more gracefully in the dma-direct
    code.

    * tag 'dma-mapping-4.18' of git://git.infradead.org/users/hch/dma-mapping: (48 commits)
    dma-direct: don't crash on device without dma_mask
    nds32: use generic dma_noncoherent_ops
    nds32: implement the unmap_sg DMA operation
    nds32: consolidate DMA cache maintainance routines
    x86/pci-dma: switch the VIA 32-bit DMA quirk to use the struct device flag
    x86/pci-dma: remove the explicit nodac and allowdac option
    x86/pci-dma: remove the experimental forcesac boot option
    Documentation/x86: remove a stray reference to pci-nommu.c
    core, dma-direct: add a flag 32-bit dma limits
    dma-mapping: remove unused gfp_t parameter to arch_dma_alloc_attrs
    dma-debug: check scatterlist segments
    c6x: use generic dma_noncoherent_ops
    arc: use generic dma_noncoherent_ops
    arc: fix arc_dma_{map,unmap}_page
    arc: fix arc_dma_sync_sg_for_{cpu,device}
    arc: simplify arc_dma_sync_single_for_{cpu,device}
    dma-mapping: provide a generic dma-noncoherent implementation
    dma-mapping: simplify Kconfig dependencies
    riscv: add swiotlb support
    riscv: only enable ZONE_DMA32 for 64-bit
    ...

    Linus Torvalds
     

17 May, 2018

2 commits


12 May, 2018

1 commit

  • Currently STRUCTLEAK inserts initialization out of live scope of variables
    from KASAN point of view. This leads to KASAN false positive reports.
    Prohibit this combination for now.

    Link: http://lkml.kernel.org/r/20180419172451.104700-1-dvyukov@google.com
    Signed-off-by: Dmitry Vyukov
    Acked-by: Kees Cook
    Cc: Fengguang Wu
    Cc: Sergey Senozhatsky
    Cc: Andrey Ryabinin
    Cc: Dennis Zhou
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dmitry Vyukov
     

08 May, 2018

1 commit


19 Apr, 2018

2 commits

  • Compat functions are now used to support 32 bit time_t in
    compat mode on 64 bit architectures and in native mode on
    32 bit architectures.

    Introduce COMPAT_32BIT_TIME to conditionally compile these
    functions.

    Note that turning off 32 bit time_t support requires more
    changes on the architecture side. For instance, architecture
    syscall tables need to be updated to drop support for 32 bit
    time_t syscalls.
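
    As a rough illustration of what conditionally compiling such
    functions can look like, here is a minimal userspace sketch. It is
    not the kernel code: DEMO_COMPAT_32BIT_TIME, the demo_* types and
    both helpers are hypothetical names made up for this example, and
    rejecting out-of-range values on narrowing is an assumption of the
    sketch, not something the commit states.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical build switch standing in for a COMPAT_32BIT_TIME-style option. */
#ifndef DEMO_COMPAT_32BIT_TIME
#define DEMO_COMPAT_32BIT_TIME 1
#endif

/* 64-bit seconds representation (illustrative layout only). */
struct demo_timespec64 {
	int64_t tv_sec;
	long    tv_nsec;
};

#if DEMO_COMPAT_32BIT_TIME
/* Legacy layout with a 32-bit seconds field. */
struct demo_compat_timespec {
	int32_t tv_sec;
	int32_t tv_nsec;
};

/* Widen a legacy value; sign extension keeps pre-1970 times intact. */
static struct demo_timespec64
demo_compat_to_timespec64(struct demo_compat_timespec in)
{
	struct demo_timespec64 out = { .tv_sec = in.tv_sec, .tv_nsec = in.tv_nsec };
	return out;
}

/* Narrow a 64-bit value; returns -1 if the seconds do not fit in 32 bits. */
static int demo_timespec64_to_compat(struct demo_timespec64 in,
				     struct demo_compat_timespec *out)
{
	assert(out != NULL);
	if (in.tv_sec > INT32_MAX || in.tv_sec < INT32_MIN)
		return -1;
	out->tv_sec = (int32_t)in.tv_sec;
	out->tv_nsec = (int32_t)in.tv_nsec;
	return 0;
}
#endif /* DEMO_COMPAT_32BIT_TIME */
```

    Building with -DDEMO_COMPAT_32BIT_TIME=0 drops the legacy type and
    both helpers, so any remaining caller fails to compile rather than
    silently truncating a time value.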

    Signed-off-by: Deepa Dinamani
    Signed-off-by: Arnd Bergmann

    Deepa Dinamani
     
  • There are a total of 53 system calls (aside from ioctl) that pass a time_t
    or derived data structure as an argument, and in order to extend time_t
    to 64-bit, we have to replace them with new system calls and keep providing
    backwards compatibility.

    To avoid adding completely new and untested code for this purpose, we
    introduce a new CONFIG_64BIT_TIME symbol. Every architecture that supports
    new 64 bit time_t syscalls enables this config.

    After this is done for all architectures, the CONFIG_64BIT_TIME symbol
    will be deleted.

    Signed-off-by: Arnd Bergmann
    Signed-off-by: Deepa Dinamani
    Signed-off-by: Arnd Bergmann

    Deepa Dinamani
     

26 Mar, 2018

1 commit


07 Feb, 2018

2 commits

  • Nearly all modern compilers support a stack-protector option, and nearly
    all modern distributions enable the kernel stack-protector, so enabling
    this by default in kernel builds would make sense. However, Kconfig does
    not have knowledge of available compiler features, so it isn't safe to
    force on, as this would unconditionally break builds for the compilers or
    architectures that don't have support. Instead, this introduces a new
    option, CONFIG_CC_STACKPROTECTOR_AUTO, which attempts to discover the best
    possible stack-protector available, and will allow builds to proceed even
    if the compiler doesn't support any stack-protector.

    This option is made the default so that kernels built with modern
    compilers will be protected-by-default against stack buffer overflows,
    avoiding things like the recent BlueBorne attack. Selection of a specific
    stack-protector option remains available, including disabling it.

    Additionally, tiny.config is adjusted to use CC_STACKPROTECTOR_NONE, since
    that's the option with the least code size (and it used to be the default,
    so we have to explicitly choose it there now).

    Link: http://lkml.kernel.org/r/1510076320-69931-4-git-send-email-keescook@chromium.org
    Signed-off-by: Kees Cook
    Tested-by: Laura Abbott
    Cc: Masahiro Yamada
    Cc: Arnd Bergmann
    Cc: Josh Triplett
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kees Cook
     
  • Various portions of the kernel, especially per-architecture pieces,
    need to know if the compiler is building with the stack protector.
    This was done in the arch/Kconfig with 'select', but this doesn't
    allow a way to do auto-detected compiler support. In preparation for
    creating an on-if-available default, move the logic for the definition of
    CONFIG_CC_STACKPROTECTOR into the Makefile.

    Link: http://lkml.kernel.org/r/1510076320-69931-3-git-send-email-keescook@chromium.org
    Signed-off-by: Kees Cook
    Tested-by: Laura Abbott
    Cc: Masahiro Yamada
    Cc: Arnd Bergmann
    Cc: Josh Triplett
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kees Cook
     

04 Feb, 2018

1 commit

  • Pull hardened usercopy whitelisting from Kees Cook:
    "Currently, hardened usercopy performs dynamic bounds checking on slab
    cache objects. This is good, but still leaves a lot of kernel memory
    available to be copied to/from userspace in the face of bugs.

    To further restrict what memory is available for copying, this creates
    a way to whitelist specific areas of a given slab cache object for
    copying to/from userspace, allowing much finer granularity of access
    control.

    Slab caches that are never exposed to userspace can declare no
    whitelist for their objects, thereby keeping them unavailable to
    userspace via dynamic copy operations. (Note, an implicit form of
    whitelisting is the use of constant sizes in usercopy operations and
    get_user()/put_user(); these bypass all hardened usercopy checks since
    these sizes cannot change at runtime.)

    This new check is WARN-by-default, so any mistakes can be found over
    the next several releases without breaking anyone's system.

    The series has roughly the following sections:
    - remove %p and improve reporting with offset
    - prepare infrastructure and whitelist kmalloc
    - update VFS subsystem with whitelists
    - update SCSI subsystem with whitelists
    - update network subsystem with whitelists
    - update process memory with whitelists
    - update per-architecture thread_struct with whitelists
    - update KVM with whitelists and fix ioctl bug
    - mark all other allocations as not whitelisted
    - update lkdtm for more sensible test coverage"
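
    The whitelist idea described in the pull message can be modelled in
    a few lines of userspace C. This is only a sketch under stated
    assumptions: struct demo_cache and demo_usercopy_allowed() are
    hypothetical names, and the region is assumed to be a single
    (offset, size) window per object.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a slab cache descriptor; not the kernel's structure. */
struct demo_cache {
	size_t object_size; /* size of each object in the cache */
	size_t useroffset;  /* start of the whitelisted region within an object */
	size_t usersize;    /* length of the whitelisted region; 0 means none */
};

/*
 * Return true if copying n bytes starting at byte `offset` of an object
 * stays entirely inside the whitelisted region.
 */
static bool demo_usercopy_allowed(const struct demo_cache *c,
				  size_t offset, size_t n)
{
	/* The region itself must lie inside the object. */
	assert(c->useroffset <= c->object_size &&
	       c->usersize <= c->object_size - c->useroffset);

	if (offset < c->useroffset)
		return false;
	offset -= c->useroffset;
	/* Compare against the remaining room to avoid overflow in offset + n. */
	return offset <= c->usersize && n <= c->usersize - offset;
}
```

    In this model a cache with usersize set to 0 rejects every non-empty
    copy, which corresponds to a cache that declares no whitelist.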

    * tag 'usercopy-v4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (38 commits)
    lkdtm: Update usercopy tests for whitelisting
    usercopy: Restrict non-usercopy caches to size 0
    kvm: x86: fix KVM_XEN_HVM_CONFIG ioctl
    kvm: whitelist struct kvm_vcpu_arch
    arm: Implement thread_struct whitelist for hardened usercopy
    arm64: Implement thread_struct whitelist for hardened usercopy
    x86: Implement thread_struct whitelist for hardened usercopy
    fork: Provide usercopy whitelisting for task_struct
    fork: Define usercopy region in thread_stack slab caches
    fork: Define usercopy region in mm_struct slab caches
    net: Restrict unwhitelisted proto caches to size 0
    sctp: Copy struct sctp_sock.autoclose to userspace using put_user()
    sctp: Define usercopy region in SCTP proto slab cache
    caif: Define usercopy region in caif proto slab cache
    ip: Define usercopy region in IP proto slab cache
    net: Define usercopy region in struct proto slab cache
    scsi: Define usercopy region in scsi_sense_cache slab cache
    cifs: Define usercopy region in cifs_request slab cache
    vxfs: Define usercopy region in vxfs_inode slab cache
    ufs: Define usercopy region in ufs_inode_cache slab cache
    ...

    Linus Torvalds
     

01 Feb, 2018

2 commits

  • Pull networking updates from David Miller:

    1) Significantly shrink the core networking routing structures. Result
    of http://vger.kernel.org/~davem/seoul2017_netdev_keynote.pdf

    2) Add netdevsim driver for testing various offloads, from Jakub
    Kicinski.

    3) Support cross-chip FDB operations in DSA, from Vivien Didelot.

    4) Add a 2nd listener hash table for TCP, similar to what was done for
    UDP. From Martin KaFai Lau.

    5) Add eBPF based queue selection to tun, from Jason Wang.

    6) Lockless qdisc support, from John Fastabend.

    7) SCTP stream interleave support, from Xin Long.

    8) Smoother TCP receive autotuning, from Eric Dumazet.

    9) Lots of erspan tunneling enhancements, from William Tu.

    10) Add true function call support to BPF, from Alexei Starovoitov.

    11) Add explicit support for GRO HW offloading, from Michael Chan.

    12) Support extack generation in more netlink subsystems. From Alexander
    Aring, Quentin Monnet, and Jakub Kicinski.

    13) Add 1000BaseX, flow control, and EEE support to mvneta driver. From
    Russell King.

    14) Add flow table abstraction to netfilter, from Pablo Neira Ayuso.

    15) Many improvements and simplifications to the NFP driver bpf JIT,
    from Jakub Kicinski.

    16) Support for ipv6 non-equal cost multipath routing, from Ido
    Schimmel.

    17) Add resource abstraction to devlink, from Arkadi Sharshevsky.

    18) Packet scheduler classifier shared filter block support, from Jiri
    Pirko.

    19) Avoid locking in act_csum, from Davide Caratti.

    20) devinet_ioctl() simplifications from Al Viro.

    21) More TCP bpf improvements from Lawrence Brakmo.

    22) Add support for onlink ipv6 route flag, similar to ipv4, from David
    Ahern.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1925 commits)
    tls: Add support for encryption using async offload accelerator
    ip6mr: fix stale iterator
    net/sched: kconfig: Remove blank help texts
    openvswitch: meter: Use 64-bit arithmetic instead of 32-bit
    tcp_nv: fix potential integer overflow in tcpnv_acked
    r8169: fix RTL8168EP take too long to complete driver initialization.
    qmi_wwan: Add support for Quectel EP06
    rtnetlink: enable IFLA_IF_NETNSID for RTM_NEWLINK
    ipmr: Fix ptrdiff_t print formatting
    ibmvnic: Wait for device response when changing MAC
    qlcnic: fix deadlock bug
    tcp: release sk_frag.page in tcp_disconnect
    ipv4: Get the address of interface correctly.
    net_sched: gen_estimator: fix lockdep splat
    net: macb: Handle HRESP error
    net/mlx5e: IPoIB, Fix copy-paste bug in flow steering refactoring
    ipv6: addrconf: break critical section in addrconf_verify_rtnl()
    ipv6: change route cache aging logic
    i40e/i40evf: Update DESC_NEEDED value to reflect larger value
    bnxt_en: cleanup DIM work on device shutdown
    ...

    Linus Torvalds
     
  • Pull dma mapping updates from Christoph Hellwig:
    "Except for a runtime warning fix from Christian this is all about
    consolidation of the generic no-IOMMU code, as well as the glue code
    for swiotlb.

    All the code is based on the x86 implementation with hooks to allow
    all architectures that aren't cache coherent to use it.

    The x86 conversion itself has been deferred because the x86
    maintainers were a little busy in the last months"

    * tag 'dma-mapping-4.16' of git://git.infradead.org/users/hch/dma-mapping: (57 commits)
    MAINTAINERS: add the iommu list for swiotlb and xen-swiotlb
    arm64: use swiotlb_alloc and swiotlb_free
    arm64: replace ZONE_DMA with ZONE_DMA32
    mips: use swiotlb_{alloc,free}
    mips/netlogic: remove swiotlb support
    tile: use generic swiotlb_ops
    tile: replace ZONE_DMA with ZONE_DMA32
    unicore32: use generic swiotlb_ops
    ia64: remove an ifdef around the content of pci-dma.c
    ia64: clean up swiotlb support
    ia64: use generic swiotlb_ops
    ia64: replace ZONE_DMA with ZONE_DMA32
    swiotlb: remove various exports
    swiotlb: refactor coherent buffer allocation
    swiotlb: refactor coherent buffer freeing
    swiotlb: wire up ->dma_supported in swiotlb_dma_ops
    swiotlb: add common swiotlb_map_ops
    swiotlb: rename swiotlb_free to swiotlb_exit
    x86: rename swiotlb_dma_ops
    powerpc: rename swiotlb_dma_ops
    ...

    Linus Torvalds
     

16 Jan, 2018

1 commit

    While the blocked and saved_sigmask fields of task_struct are copied to
    userspace (via sigmask_to_save() and setup_rt_frame()), they are always
    copied with a static length (i.e. sizeof(sigset_t)).

    The only portion of task_struct that is potentially dynamically sized and
    may be copied to userspace is in the architecture-specific thread_struct
    at the end of task_struct.

    cache object allocation:
        kernel/fork.c:
            alloc_task_struct_node(...):
                return kmem_cache_alloc_node(task_struct_cachep, ...);

            dup_task_struct(...):
                ...
                tsk = alloc_task_struct_node(node);

            copy_process(...):
                ...
                dup_task_struct(...)

            _do_fork(...):
                ...
                copy_process(...)

    example usage trace:

        arch/x86/kernel/fpu/signal.c:
            __fpu__restore_sig(...):
                ...
                struct task_struct *tsk = current;
                struct fpu *fpu = &tsk->thread.fpu;
                ...
                __copy_from_user(&fpu->state.xsave, ..., state_size);

            fpu__restore_sig(...):
                ...
                return __fpu__restore_sig(...);

        arch/x86/kernel/signal.c:
            restore_sigcontext(...):
                ...
                fpu__restore_sig(...)

    This introduces arch_thread_struct_whitelist() to let an architecture
    declare specifically where the whitelist should be within thread_struct.
    If undefined, the entire thread_struct field is left whitelisted.
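
    A hook of this kind could plausibly be shaped as below. This is a
    userspace sketch only: the demo_* structures are toy layouts, and
    the function names and the (offset, size) out-parameter signature
    are assumptions for illustration, not a copy of any architecture's
    implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Toy layouts; a real thread_struct differs per architecture. */
struct demo_fpu {
	unsigned char state[512];
};

struct demo_thread_struct {
	unsigned long sp;
	unsigned long flags;
	struct demo_fpu fpu; /* the only part copied to/from userspace here */
};

/*
 * Model of an architecture hook: report which byte range of the thread
 * struct may be copied to/from userspace.
 */
static void demo_arch_thread_struct_whitelist(unsigned long *offset,
					      unsigned long *size)
{
	assert(offset != NULL && size != NULL);
	*offset = offsetof(struct demo_thread_struct, fpu);
	*size = sizeof(((struct demo_thread_struct *)0)->fpu);
}

/* Fallback modelling "if undefined, the entire field is whitelisted". */
static void demo_default_thread_struct_whitelist(unsigned long *offset,
						 unsigned long *size)
{
	assert(offset != NULL && size != NULL);
	*offset = 0;
	*size = sizeof(struct demo_thread_struct);
}
```

    Deriving the window with offsetof() and sizeof keeps it tied to the
    structure definition, so it follows any later change in field order
    or size.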

    Cc: Andrew Morton
    Cc: Nicholas Piggin
    Cc: Laura Abbott
    Cc: "Mickaël Salaün"
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Andy Lutomirski
    Signed-off-by: Kees Cook
    Acked-by: Rik van Riel

    Kees Cook
     

13 Jan, 2018

1 commit

    The error-injection framework is not limited to use by kprobes
    or bpf. Other kernel subsystems, e.g. livepatch and ftrace, can
    use it freely to check whether error injection is safe.
    So this separates the error-injection framework from kprobes.

    Some changes have been made:

    - The word "kprobe" is removed from all APIs/structures.
    - BPF_ALLOW_ERROR_INJECTION() is renamed to
    ALLOW_ERROR_INJECTION() since it is no longer limited to BPF.
    - CONFIG_FUNCTION_ERROR_INJECTION is the config item for this
    feature. It is automatically enabled if the arch supports the
    error injection feature for kprobes, ftrace, etc.
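
    The underlying idea, a list of functions explicitly marked as safe
    for error injection that any subsystem can query, can be modelled
    roughly as follows. Everything here is hypothetical (the demo_*
    names, the static table, the function-pointer lookup); how the
    kernel actually builds and searches its list is not shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef int (*demo_fn)(void);

/* Two toy functions; only the first is marked as safe for injection. */
static int demo_injectable(void) { return 0; }
static int demo_not_injectable(void) { return 0; }

/* Toy whitelist of functions where a forced error return is acceptable. */
static const demo_fn demo_error_injection_whitelist[] = {
	demo_injectable,
};

/* Subsystem-neutral query: is this function on the whitelist? */
static bool demo_within_error_injection_list(demo_fn fn)
{
	size_t n = sizeof(demo_error_injection_whitelist) /
		   sizeof(demo_error_injection_whitelist[0]);

	assert(fn != NULL);
	for (size_t i = 0; i < n; i++)
		if (demo_error_injection_whitelist[i] == fn)
			return true;
	return false;
}
```

    Because the query does not mention kprobes or BPF, any caller can
    run the same safety check before overriding a return value.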

    Signed-off-by: Masami Hiramatsu
    Reviewed-by: Josef Bacik
    Signed-off-by: Alexei Starovoitov

    Masami Hiramatsu
     

10 Jan, 2018

1 commit

  • phys_to_dma, dma_to_phys and dma_capable are helpers published by
    architecture code for use by swiotlb and xen-swiotlb only. Drivers are
    not supposed to use these directly, but should use the DMA API instead.

    Move these to a new asm/dma-direct.h helper, included by a
    linux/dma-direct.h wrapper that provides the default linear mapping
    unless the architecture wants to override it.
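
    The "default linear mapping" can be pictured with a small userspace
    model. This is a sketch under assumptions: the demo_* types and
    functions are hypothetical stand-ins, the mapping is taken to be the
    identity, and the capability check is assumed to compare the last
    byte of the buffer against the device's DMA mask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t demo_phys_addr_t;
typedef uint64_t demo_dma_addr_t;

/* Toy device: only the DMA mask matters for this sketch. */
struct demo_device {
	const uint64_t *dma_mask; /* NULL if the device has no mask set */
};

/* Default linear mapping: the bus address equals the physical address. */
static demo_dma_addr_t demo_phys_to_dma(const struct demo_device *dev,
					demo_phys_addr_t paddr)
{
	(void)dev;
	return paddr;
}

/* Inverse of the mapping above. */
static demo_phys_addr_t demo_dma_to_phys(const struct demo_device *dev,
					 demo_dma_addr_t daddr)
{
	(void)dev;
	return daddr;
}

/* True if the last byte of [addr, addr + size) is reachable under the mask. */
static bool demo_dma_capable(const struct demo_device *dev,
			     demo_dma_addr_t addr, size_t size)
{
	assert(size > 0);
	if (!dev->dma_mask)
		return false;
	return addr + size - 1 <= *dev->dma_mask;
}
```

    An architecture with a bus offset would replace the two translation
    helpers with its own versions, while callers such as a bounce-buffer
    layer keep using the same three names.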

    In the MIPS case the existing dma-coherent.h is reused for now as
    untangling it will take a bit of work.

    Signed-off-by: Christoph Hellwig
    Acked-by: Robin Murphy

    Christoph Hellwig