04 Nov, 2018

1 commit

  • [ Upstream commit 04f264d3a8b0eb25d378127bd78c3c9a0261c828 ]

    We have a need to override the definition of
    barrier_before_unreachable() for MIPS, which means we either need to add
    architecture-specific code into linux/compiler-gcc.h or we need to allow
    the architecture to provide a header that can define the macro before
    the generic definition. The latter seems like the better approach.

    A straightforward approach to the per-arch header is to make use of
    asm-generic to provide a default empty header & adjust architectures
    which don't need anything specific to make use of that by adding the
    header to generic-y. Unfortunately this doesn't work so well due to
    commit 28128c61e08e ("kconfig.h: Include compiler types to avoid missed
    struct attributes") which caused linux/compiler_types.h to be included
    in the compilation of every C file via the -include linux/kconfig.h flag
    in c_flags.

    Because the -include flag is present for all C files we compile, we need
    the architecture-provided header to be present before any C files are
    compiled. If any C files can be compiled prior to the asm-generic header
    wrappers being generated then we hit a build failure due to a missing
    header. Such cases do exist - one pointed out by the kbuild test robot
    is the compilation of arch/ia64/kernel/nr-irqs.c, which occurs as part
    of the archprepare target [1].

    This leaves us with a few options:

    1) Use generic-y & fix any build failures we find by enforcing
    ordering such that the asm-generic target occurs before any C
    compilation, such that linux/compiler_types.h can always include
    the generated asm-generic wrapper which in turn includes the empty
    asm-generic header. This would rely on us finding all the
    problematic cases - I don't know for sure that the ia64 issue is
    the only one.

    2) Add an actual empty header to each architecture, so that we don't
    need the generated asm-generic wrapper. This seems messy.

    3) Give up & add #ifdef CONFIG_MIPS or similar to
    linux/compiler_types.h. This seems messy too.

    4) Include the arch header only when it's actually needed, removing
    the need for the asm-generic wrapper for all other architectures.

    This patch allows us to use approach 4, by including an asm/compiler.h
    header from linux/compiler_types.h after the inclusion of the
    compiler-specific linux/compiler-*.h header(s). We do this
    conditionally, only when CONFIG_HAVE_ARCH_COMPILER_H is selected, in
    order to avoid the need for asm-generic wrappers & the associated build
    ordering issue described above. The asm/compiler.h header is included
    after the generic linux/compiler-*.h header(s) for consistency with the
    way linux/compiler-intel.h & linux/compiler-clang.h are included after
    the linux/compiler-gcc.h header that they override.

    [1] https://lists.01.org/pipermail/kbuild-all/2018-August/051175.html

    Signed-off-by: Paul Burton
    Reviewed-by: Masahiro Yamada
    Patchwork: https://patchwork.linux-mips.org/patch/20269/
    Cc: Arnd Bergmann
    Cc: James Hogan
    Cc: Masahiro Yamada
    Cc: Ralf Baechle
    Cc: linux-arch@vger.kernel.org
    Cc: linux-kbuild@vger.kernel.org
    Cc: linux-mips@linux-mips.org
    Signed-off-by: Sasha Levin

    Paul Burton
     

05 Sep, 2018

1 commit

  • commit d86564a2f085b79ec046a5cba90188e612352806 upstream.

    Jann reported that x86 was missing required TLB invalidates when he
    hit the !*batch slow path in tlb_remove_table().

    This is indeed the case; RCU_TABLE_FREE does not provide TLB (cache)
    invalidates, the PowerPC-hash where this code originated and the
    Sparc-hash where this was subsequently used did not need that. ARM
    which later used this put an explicit TLB invalidate in their
    __p*_free_tlb() functions, and PowerPC-radix followed that example.

    But when we hooked up x86 we failed to consider this. Fix this by
    (optionally) hooking tlb_remove_table() into the TLB invalidate code.

    NOTE: s390 also needed something like this and might now
    be able to use the generic code again.

    [ Modified to be on top of Nick's cleanups, which simplified this patch
    now that tlb_flush_mmu_tlbonly() really only flushes the TLB - Linus ]

    Fixes: 9e52fc2b50de ("x86/mm: Enable RCU based page table freeing (CONFIG_HAVE_RCU_TABLE_FREE=y)")
    Reported-by: Jann Horn
    Signed-off-by: Peter Zijlstra (Intel)
    Acked-by: Rik van Riel
    Cc: Nicholas Piggin
    Cc: David Miller
    Cc: Will Deacon
    Cc: Martin Schwidefsky
    Cc: Michael Ellerman
    Cc: stable@kernel.org
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Peter Zijlstra
     

16 Aug, 2018

1 commit

  • commit 05736e4ac13c08a4a9b1ef2de26dd31a32cbee57 upstream

    Provide a command line and a sysfs knob to control SMT.

    The command line options are:

    'nosmt': Enumerate secondary threads, but do not online them

    'nosmt=force': Ignore secondary threads completely during enumeration
    via MP table and ACPI/MADT.

    The sysfs control file has the following states (read/write):

    'on': SMT is enabled. Secondary threads can be freely onlined.
    'off': SMT is disabled. Secondary threads, even if enumerated,
    cannot be onlined.
    'forceoff': SMT is permanently disabled. Writes to the control
    file are rejected.
    'notsupported': SMT is not supported by the CPU.

    The command line option 'nosmt' sets the sysfs control to 'off'. This
    can be changed to 'on' to reenable SMT during runtime.

    The command line option 'nosmt=force' sets the sysfs control to
    'forceoff'. This cannot be changed during runtime.

    When SMT is 'on' and the control file is changed to 'off' then all online
    secondary threads are offlined and attempts to online a secondary thread
    later on are rejected.

    When SMT is 'off' and the control file is changed to 'on' then secondary
    threads can be onlined again. The 'off' -> 'on' transition does not
    automatically online the secondary threads.

    When the control file is set to 'forceoff', the behaviour is the same as
    setting it to 'off', but the operation is irreversible and later writes to
    the control file are rejected.

    When the control status is 'notsupported' then writes to the control file
    are rejected.

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Konrad Rzeszutek Wilk
    Acked-by: Ingo Molnar
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

10 Dec, 2017

1 commit

  • [ Upstream commit a30b85df7d599f626973e9cd3056fe755bd778e0 ]

    We want to wait for all potentially preempted kprobes trampoline
    execution to have completed. This guarantees that any freed
    trampoline memory is not in use by any task in the system anymore.
    synchronize_rcu_tasks() gives such a guarantee, so use it.

    Also, this guarantees to wait for all potentially preempted tasks
    on the instructions which will be replaced with a jump.

    Since this becomes a problem only when CONFIG_PREEMPT=y, enable
    CONFIG_TASKS_RCU=y for synchronize_rcu_tasks() in that case.
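    In Kconfig terms, the dependency amounts to something like the
    following sketch (the real KPROBES entry carries more selects and
    dependencies):

```
# Sketch: kprobes pulls in Tasks RCU only when preemption makes the
# freed-trampoline problem possible.
config KPROBES
        select TASKS_RCU if PREEMPT
```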

    Signed-off-by: Masami Hiramatsu
    Acked-by: Paul E. McKenney
    Cc: Ananth N Mavinakayanahalli
    Cc: Linus Torvalds
    Cc: Naveen N . Rao
    Cc: Paul E . McKenney
    Cc: Peter Zijlstra
    Cc: Steven Rostedt
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/150845661962.5443.17724352636247312231.stgit@devbox
    Signed-off-by: Ingo Molnar
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Masami Hiramatsu
     

02 Nov, 2017

1 commit

  • Many source files in the tree are missing licensing information, which
    makes it harder for compliance tools to determine the correct license.

    By default all files without license information are under the default
    license of the kernel, which is GPL version 2.

    Update the files which contain no license information with the 'GPL-2.0'
    SPDX license identifier. The SPDX identifier is a legally binding
    shorthand, which can be used instead of the full boilerplate text.
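    For a C source file, the tag is a single comment on the first line,
    for example:

```c
// SPDX-License-Identifier: GPL-2.0
```

    Other file types use their own comment syntax for the same line.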

    This patch is based on work done by Thomas Gleixner and Kate Stewart and
    Philippe Ombredanne.

    How this work was done:

    Patches were generated and checked against linux-4.14-rc6 for a subset of
    the use cases:
    - file had no licensing information in it,
    - file was a */uapi/* one with no licensing information in it,
    - file was a */uapi/* one with existing licensing information.

    Further patches will be generated in subsequent months to fix up cases
    where non-standard license headers were used, and references to license
    had to be inferred by heuristics based on keywords.

    The analysis to determine which SPDX License Identifier to be applied to
    a file was done in a spreadsheet of side by side results from of the
    output of two independent scanners (ScanCode & Windriver) producing SPDX
    tag:value files created by Philippe Ombredanne. Philippe prepared the
    base worksheet, and did an initial spot review of a few 1000 files.

    The 4.13 kernel was the starting point of the analysis with 60,537 files
    assessed. Kate Stewart did a file by file comparison of the scanner
    results in the spreadsheet to determine which SPDX license identifier(s)
    to be applied to the file. She confirmed any determination that was not
    immediately clear with lawyers working with the Linux Foundation.

    Criteria used to select files for SPDX license identifier tagging were:
    - Files considered eligible had to be source code files.
    - Make and config files were included as candidates if they contained >5
    lines of source
    - File already had some variant of a license header in it (even if
    Reviewed-by: Philippe Ombredanne
    Reviewed-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     

09 Oct, 2017

1 commit

    The new flag PCI_DEV_FLAGS_NO_RELAXED_ORDERING has been added
    to indicate that Relaxed Ordering (RO) Attributes should not
    be used for Transaction Layer Packets (TLPs) targeted toward
    the affected Root Ports. It clears bit 4 in the PCIe
    Device Control register, so PCIe device drivers can
    query PCIe configuration space to determine whether they can
    send TLPs to a Root Port with the Relaxed Ordering Attribute set.

    With this new flag we no longer need the ARCH_WANT_RELAX_ORDER
    config option to control the Relaxed Ordering Attributes for the
    ixgbe driver, as commit 1a8b6d76dc5b ("net:add one common config...") did,
    so revert that commit.

    Signed-off-by: Ding Tianhong
    Tested-by: Andrew Bowers
    Signed-off-by: Jeff Kirsher

    Ding Tianhong
     

08 Sep, 2017

1 commit

  • Pull gcc plugins update from Kees Cook:
    "This finishes the porting work on randstruct, and introduces a new
    option to structleak, both noted below:

    - For the randstruct plugin, enable automatic randomization of
    structures that are entirely function pointers (along with a couple
    designated initializer fixes).

    - For the structleak plugin, provide an option to perform zeroing
    initialization of all otherwise uninitialized stack variables that
    are passed by reference (Ard Biesheuvel)"

    * tag 'gcc-plugins-v4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
    gcc-plugins: structleak: add option to init all vars used as byref args
    randstruct: Enable function pointer struct detection
    drivers/net/wan/z85230.c: Use designated initializers
    drm/amd/powerplay: rv: Use designated initializers

    Linus Torvalds
     

17 Aug, 2017

1 commit

  • This implements refcount_t overflow protection on x86 without a noticeable
    performance impact, though without the fuller checking of REFCOUNT_FULL.

    This is done by duplicating the existing atomic_t refcount implementation
    but with normally a single instruction added to detect if the refcount
    has gone negative (e.g. wrapped past INT_MAX or below zero). When detected,
    the handler saturates the refcount_t to INT_MIN / 2. With this overflow
    protection, the erroneous reference release that would follow a wrap back
    to zero is blocked from happening, avoiding the class of refcount-overflow
    use-after-free vulnerabilities entirely.

    Only the overflow case of refcounting can be perfectly protected, since
    it can be detected and stopped before the reference is freed and left to
    be abused by an attacker. There isn't a way to block early decrements,
    and while REFCOUNT_FULL stops increment-from-zero cases (which would
    be the state _after_ an early decrement and stops potential double-free
    conditions), this fast implementation does not, since it would require
    the more expensive cmpxchg loops. Since the overflow case is much more
    common (e.g. missing a "put" during an error path), this protection
    provides real-world protection. For example, the two public refcount
    overflow use-after-free exploits published in 2016 would have been
    rendered unexploitable:

    http://perception-point.io/2016/01/14/analysis-and-exploitation-of-a-linux-kernel-vulnerability-cve-2016-0728/

    http://cyseclabs.com/page?n=02012016

    This implementation does, however, notice an unchecked decrement to zero
    (i.e. caller used refcount_dec() instead of refcount_dec_and_test() and it
    resulted in a zero). Decrements under zero are noticed (since they will
    have resulted in a negative value), though this only indicates that a
    use-after-free may have already happened. Such notifications are likely
    avoidable by an attacker that has already exploited a use-after-free
    vulnerability, but it's better to have them reported than allow such
    conditions to remain universally silent.

    On first overflow detection, the refcount value is reset to INT_MIN / 2
    (which serves as a saturation value) and a report and stack trace are
    produced. When operations detect only negative value results (such as
    changing an already saturated value), saturation still happens but no
    notification is performed (since the value was already saturated).

    On the matter of races, since the entire range beyond INT_MAX but before
    0 is negative, every operation at INT_MIN / 2 will trap, leaving no
    overflow-only race condition.

    As for performance, this implementation adds a single "js" instruction
    to the regular execution flow of a copy of the standard atomic_t refcount
    operations. (The non-"and_test" refcount_dec() function, which is uncommon
    in regular refcount design patterns, has an additional "jz" instruction
    to detect reaching exactly zero.) Since this is a forward jump, it is by
    default the non-predicted path, which will be reinforced by dynamic branch
    prediction. The result is this protection having virtually no measurable
    change in performance over standard atomic_t operations. The error path,
    located in .text.unlikely, saves the refcount location and then uses UD0
    to fire a refcount exception handler, which resets the refcount, handles
    reporting, and returns to regular execution. This keeps the changes to
    .text size minimal, avoiding return jumps and open-coded calls to the
    error reporting routine.

    Example assembly comparison:

    refcount_inc() before:

    .text:
    ffffffff81546149: f0 ff 45 f4 lock incl -0xc(%rbp)

    refcount_inc() after:

    .text:
    ffffffff81546149: f0 ff 45 f4 lock incl -0xc(%rbp)
    ffffffff8154614d: 0f 88 80 d5 17 00 js ffffffff816c36d3
    ...
    .text.unlikely:
    ffffffff816c36d3: 48 8d 4d f4 lea -0xc(%rbp),%rcx
    ffffffff816c36d7: 0f ff (bad)

    These are the cycle counts comparing a loop of refcount_inc() from 1
    to INT_MAX and back down to 0 (via refcount_dec_and_test()), between
    unprotected refcount_t (atomic_t), fully protected REFCOUNT_FULL
    (refcount_t-full), and this overflow-protected refcount (refcount_t-fast):

    2147483646 refcount_inc()s and 2147483647 refcount_dec_and_test()s:
                     cycles        protections
    atomic_t         82249267387   none
    refcount_t-fast  82211446892   overflow, untested dec-to-zero
    refcount_t-full  144814735193  overflow, untested dec-to-zero, inc-from-zero

    This code is a modified version of the x86 PAX_REFCOUNT atomic_t
    overflow defense from the last public patch of PaX/grsecurity, based
    on my understanding of the code. Changes or omissions from the original
    code are mine and don't reflect the original grsecurity/PaX code. Thanks
    to PaX Team for various suggestions for improvement for repurposing this
    code to be a refcount-only protection.

    Signed-off-by: Kees Cook
    Reviewed-by: Josh Poimboeuf
    Cc: Alexey Dobriyan
    Cc: Andrew Morton
    Cc: Arnd Bergmann
    Cc: Christoph Hellwig
    Cc: David S. Miller
    Cc: Davidlohr Bueso
    Cc: Elena Reshetova
    Cc: Eric Biggers
    Cc: Eric W. Biederman
    Cc: Greg KH
    Cc: Hans Liljestrand
    Cc: James Bottomley
    Cc: Jann Horn
    Cc: Linus Torvalds
    Cc: Manfred Spraul
    Cc: Peter Zijlstra
    Cc: Rik van Riel
    Cc: Serge E. Hallyn
    Cc: Thomas Gleixner
    Cc: arozansk@redhat.com
    Cc: axboe@kernel.dk
    Cc: kernel-hardening@lists.openwall.com
    Cc: linux-arch
    Link: http://lkml.kernel.org/r/20170815161924.GA133115@beast
    Signed-off-by: Ingo Molnar

    Kees Cook
     

08 Aug, 2017

2 commits

  • Kees Cook
     
  • In the Linux kernel, struct type variables are rarely passed by-value,
    and so functions that initialize such variables typically take an input
    reference to the variable rather than returning a value that can
    subsequently be used in an assignment.

    If the initialization function is not part of the same compilation unit,
    the lack of an assignment operation defeats any analysis the compiler
    can perform as to whether the variable may be used before having been
    initialized. This means we may end up passing on such variables
    uninitialized, resulting in potential information leaks.

    So extend the existing structleak GCC plugin so it will [optionally]
    apply to all struct type variables that have their address taken at any
    point, rather than only to variables of struct types that have a __user
    annotation.

    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Kees Cook

    Ard Biesheuvel
     

02 Aug, 2017

1 commit


13 Jul, 2017

2 commits

  • This adds support for compiling with a rough equivalent to the glibc
    _FORTIFY_SOURCE=1 feature, providing compile-time and runtime buffer
    overflow checks for string.h functions when the compiler determines the
    size of the source or destination buffer at compile-time. Unlike glibc,
    it covers buffer reads in addition to writes.

    GNU C __builtin_*_chk intrinsics are avoided because they would force a
    much more complex implementation. They aren't designed to detect read
    overflows and offer no real benefit when using an implementation based
    on inline checks. Inline checks don't add up to much code size and
    allow full use of the regular string intrinsics while avoiding the need
    for a bunch of _chk functions and per-arch assembly to avoid wrapper
    overhead.

    This detects various overflows at compile-time in various drivers and
    some non-x86 core kernel code. There will likely be issues caught in
    regular use at runtime too.

    Future improvements left out of initial implementation for simplicity,
    as it's all quite optional and can be done incrementally:

    * Some of the fortified string functions (strncpy, strcat) don't yet
    place a limit on reads from the source based on __builtin_object_size of
    the source buffer.

    * Extending coverage to more string functions like strlcat.

    * It should be possible to optionally use __builtin_object_size(x, 1) for
    some functions (C strings) to detect intra-object overflows (like
    glibc's _FORTIFY_SOURCE=2), but for now this takes the conservative
    approach to avoid likely compatibility issues.

    * The compile-time checks should be made available via a separate config
    option which can be enabled by default (or always enabled) once enough
    time has passed to get the issues it catches fixed.

    Kees said:
    "This is great to have. While it was out-of-tree code, it would have
    blocked at least CVE-2016-3858 from being exploitable (improper size
    argument to strlcpy()). I've sent a number of fixes for
    out-of-bounds-reads that this detected upstream already"

    [arnd@arndb.de: x86: fix fortified memcpy]
    Link: http://lkml.kernel.org/r/20170627150047.660360-1-arnd@arndb.de
    [keescook@chromium.org: avoid panic() in favor of BUG()]
    Link: http://lkml.kernel.org/r/20170626235122.GA25261@beast
    [keescook@chromium.org: move from -mm, add ARCH_HAS_FORTIFY_SOURCE, tweak Kconfig help]
    Link: http://lkml.kernel.org/r/20170526095404.20439-1-danielmicay@gmail.com
    Link: http://lkml.kernel.org/r/1497903987-21002-8-git-send-email-keescook@chromium.org
    Signed-off-by: Daniel Micay
    Signed-off-by: Kees Cook
    Signed-off-by: Arnd Bergmann
    Acked-by: Kees Cook
    Cc: Mark Rutland
    Cc: Daniel Axtens
    Cc: Rasmus Villemoes
    Cc: Andy Shevchenko
    Cc: Chris Metcalf
    Cc: Thomas Gleixner
    Cc: "H. Peter Anvin"
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Daniel Micay
     
  • Split SOFTLOCKUP_DETECTOR from LOCKUP_DETECTOR, and split
    HARDLOCKUP_DETECTOR_PERF from HARDLOCKUP_DETECTOR.

    LOCKUP_DETECTOR implies the general boot, sysctl, and programming
    interfaces for the lockup detectors.

    An architecture that wants to use a hard lockup detector must define
    HAVE_HARDLOCKUP_DETECTOR_PERF or HAVE_HARDLOCKUP_DETECTOR_ARCH.

    Alternatively an arch can define HAVE_NMI_WATCHDOG, which provides the
    minimum arch_touch_nmi_watchdog, and it otherwise does its own thing and
    does not implement the LOCKUP_DETECTOR interfaces.

    sparc is unusual in that it has started to implement some of the
    interfaces, but not fully yet. It should probably be converted to a full
    HAVE_HARDLOCKUP_DETECTOR_ARCH.
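    A rough Kconfig shape of the split (option names per the commit; the
    real entries carry prompts, defaults, and further dependencies):

```
config SOFTLOCKUP_DETECTOR
        bool "Detect Soft Lockups"

config HARDLOCKUP_DETECTOR_PERF
        bool
        depends on HAVE_HARDLOCKUP_DETECTOR_PERF
```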

    [npiggin@gmail.com: fix]
    Link: http://lkml.kernel.org/r/20170617223522.66c0ad88@roar.ozlabs.ibm.com
    Link: http://lkml.kernel.org/r/20170616065715.18390-4-npiggin@gmail.com
    Signed-off-by: Nicholas Piggin
    Reviewed-by: Don Zickus
    Reviewed-by: Babu Moger
    Tested-by: Babu Moger [sparc]
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: Michael Ellerman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nicholas Piggin
     

08 Jul, 2017

1 commit

  • …/masahiroy/linux-kbuild

    Pull Kbuild thin archives updates from Masahiro Yamada:
    "Thin archives migration by Nicholas Piggin.

    THIN_ARCHIVES has been available for a while as an optional feature
    only for PowerPC architecture, but we do not need two different
    intermediate-artifact schemes.

    Using thin archives instead of conventional incremental linking has
    various advantages:

    - save disk space for builds

    - speed-up building a little

    - fix some link issues (for example, allyesconfig on ARM) due to more
    flexibility for the final linking

    - work better with dead code elimination we are planning

    As discussed before, this migration has been done unconditionally so
    that any problems caused by this will show up with "git bisect".

    With testing with 0-day and linux-next, some architectures actually
    showed up problems, but they were trivial and all fixed now"

    * tag 'kbuild-thinar-v4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
    tile: remove unneeded extra-y in Makefile
    kbuild: thin archives make default for all archs
    x86/um: thin archives build fix
    tile: thin archives fix linking
    ia64: thin archives fix linking
    sh: thin archives fix linking
    kbuild: handle libs-y archives separately from built-in.o archives
    kbuild: thin archives use P option to ar
    kbuild: thin archives final link close --whole-archives option
    ia64: remove unneeded extra-y in Makefile.gate
    tile: fix dependency and .*.cmd inclusion for incremental build
    sparc64: Use indirect calls in hamming weight stubs

    Linus Torvalds
     

06 Jul, 2017

1 commit

  • Pull GCC plugin updates from Kees Cook:
    "The big part is the randstruct plugin infrastructure.

    This is the first of two expected pull requests for randstruct since
    there are dependencies in other trees that would be easier to merge
    once those have landed. Notably, the IPC allocation refactoring in
    -mm, and many trivial merge conflicts across several trees when
    applying the __randomize_layout annotation.

    As a result, it seemed like I should send this now since it is
    relatively self-contained, and once the rest of the trees have landed,
    send the annotation patches. I'm expecting the final phase of
    randstruct (automatic struct selection) will land for v4.14, but if
    its other tree dependencies actually make it for v4.13, I can send
    that merge request too.

    Summary:

    - typo fix in Kconfig (Jean Delvare)

    - randstruct infrastructure"

    * tag 'gcc-plugins-v4.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
    ARM: Prepare for randomized task_struct
    randstruct: Whitelist NIU struct page overloading
    randstruct: Whitelist big_key path struct overloading
    randstruct: Whitelist UNIXCB cast
    randstruct: Whitelist struct security_hook_heads cast
    gcc-plugins: Add the randstruct plugin
    Fix English in description of GCC_PLUGIN_STRUCTLEAK
    compiler: Add __designated_init annotation
    gcc-plugins: Detail c-common.h location for GCC 4.6

    Linus Torvalds
     

05 Jul, 2017

1 commit


30 Jun, 2017

1 commit


29 Jun, 2017

1 commit

  • Many subsystems will not use refcount_t unless there is a way to build the
    kernel so that there is no regression in speed compared to atomic_t. This
    adds CONFIG_REFCOUNT_FULL to enable the full refcount_t implementation
    which has the validation but is slightly slower. When not enabled,
    refcount_t uses the basic unchecked atomic_t routines, which results in
    no code changes compared to just using atomic_t directly.
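    The new option is a plain Kconfig switch, roughly (help text
    abridged):

```
config REFCOUNT_FULL
        bool "Perform full reference count validation at the expense of speed"
        help
          Enabling this switches refcount_t from the fast unchecked
          atomic_t routines to the fully validated implementation.
```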

    Signed-off-by: Kees Cook
    Acked-by: Greg Kroah-Hartman
    Cc: Alexey Dobriyan
    Cc: Andrew Morton
    Cc: Arnd Bergmann
    Cc: Christoph Hellwig
    Cc: David S. Miller
    Cc: David Windsor
    Cc: Davidlohr Bueso
    Cc: Elena Reshetova
    Cc: Eric Biggers
    Cc: Eric W. Biederman
    Cc: Hans Liljestrand
    Cc: James Bottomley
    Cc: Jann Horn
    Cc: Josh Poimboeuf
    Cc: Linus Torvalds
    Cc: Manfred Spraul
    Cc: Peter Zijlstra
    Cc: Rik van Riel
    Cc: Serge E. Hallyn
    Cc: Thomas Gleixner
    Cc: arozansk@redhat.com
    Cc: axboe@kernel.dk
    Cc: linux-arch
    Link: http://lkml.kernel.org/r/20170621200026.GA115679@beast
    Signed-off-by: Ingo Molnar

    Kees Cook
     

23 Jun, 2017

1 commit

  • This randstruct plugin is modified from Brad Spengler/PaX Team's code
    in the last public patch of grsecurity/PaX based on my understanding
    of the code. Changes or omissions from the original code are mine and
    don't reflect the original grsecurity/PaX code.

    The randstruct GCC plugin randomizes the layout of selected structures
    at compile time, as a probabilistic defense against attacks that need to
    know the layout of structures within the kernel. This is most useful for
    "in-house" kernel builds where neither the randomization seed nor other
    build artifacts are made available to an attacker. While less useful for
    distribution kernels (where the randomization seed must be exposed for
    third party kernel module builds), it still has some value there since now
    all kernel builds would need to be tracked by an attacker.

    In more performance sensitive scenarios, GCC_PLUGIN_RANDSTRUCT_PERFORMANCE
    can be selected to make a best effort to restrict randomization to
    cacheline-sized groups of elements, and will not randomize bitfields. This
    comes at the cost of reduced randomization.

    Two annotations are defined, __randomize_layout and __no_randomize_layout,
    which respectively tell the plugin to either randomize or not to
    randomize instances of the struct in question. Follow-on patches enable
    the auto-detection logic for selecting structures for randomization
    that contain only function pointers. It is disabled here to assist with
    bisection.

    Since any randomized structs must be initialized using designated
    initializers, __randomize_layout includes the __designated_init annotation
    even when the plugin is disabled so that all builds will require
    the needed initialization. (With the plugin enabled, annotations for
    automatically chosen structures are marked as well.)

    The main differences between this implementation and grsecurity are:
    - disable automatic struct selection (to be enabled in follow-up patch)
    - add designated_init attribute at runtime and for manual marking
    - clarify debugging output to differentiate bad cast warnings
    - add whitelisting infrastructure
    - support gcc 7's DECL_ALIGN and DECL_MODE changes (Laura Abbott)
    - raise minimum required GCC version to 4.7

    Earlier versions of this patch series were ported by Michael Leibowitz.

    Signed-off-by: Kees Cook

    Kees Cook
     

20 Jun, 2017

1 commit


11 May, 2017

1 commit

  • Pull RCU updates from Ingo Molnar:
    "The main changes are:

    - Debloat RCU headers

    - Parallelize SRCU callback handling (plus overlapping patches)

    - Improve the performance of Tree SRCU on a CPU-hotplug stress test

    - Documentation updates

    - Miscellaneous fixes"

    * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (74 commits)
    rcu: Open-code the rcu_cblist_n_lazy_cbs() function
    rcu: Open-code the rcu_cblist_n_cbs() function
    rcu: Open-code the rcu_cblist_empty() function
    rcu: Separately compile large rcu_segcblist functions
    srcu: Debloat the header
    srcu: Adjust default auto-expediting holdoff
    srcu: Specify auto-expedite holdoff time
    srcu: Expedite first synchronize_srcu() when idle
    srcu: Expedited grace periods with reduced memory contention
    srcu: Make rcutorture writer stalls print SRCU GP state
    srcu: Exact tracking of srcu_data structures containing callbacks
    srcu: Make SRCU be built by default
    srcu: Fix Kconfig botch when SRCU not selected
    rcu: Make non-preemptive schedule be Tasks RCU quiescent state
    srcu: Expedite srcu_schedule_cbs_snp() callback invocation
    srcu: Parallelize callback handling
    kvm: Move srcu_struct fields to end of struct kvm
    rcu: Fix typo in PER_RCU_NODE_PERIOD header comment
    rcu: Use true/false in assignment to bool
    rcu: Use bool value directly
    ...

    Linus Torvalds
     

09 May, 2017

1 commit

  • Patch series "kexec/fadump: remove dependency with CONFIG_KEXEC and
    reuse crashkernel parameter for fadump", v4.

    Traditionally, kdump is used to save vmcore in case of a crash. Some
    architectures like powerpc can save vmcore using architecture specific
    support instead of kexec/kdump mechanism. Such architecture specific
    support also needs to reserve memory, to be used by the dump capture
    kernel. The crashkernel parameter can be reused for memory reservation
    by such architecture-specific infrastructure.

    This patchset removes the dependency on CONFIG_KEXEC for the
    crashkernel parameter and the vmcoreinfo-related code, as they can be
    reused without kexec support. Also, the crashkernel parameter is
    reused instead of fadump_reserve_mem to reserve memory for fadump.

    The first patch moves crashkernel parameter parsing and vmcoreinfo
    related code under CONFIG_CRASH_CORE instead of CONFIG_KEXEC_CORE. The
    second patch reuses the definitions of append_elf_note() & final_note()
    functions under CONFIG_CRASH_CORE in IA64 arch code. The third patch
    removes dependency on CONFIG_KEXEC for firmware-assisted dump (fadump)
    in powerpc. The next patch reuses crashkernel parameter for reserving
    memory for fadump, instead of the fadump_reserve_mem parameter. This
    has the advantage of making all the syntaxes the crashkernel parameter
    supports available for fadump as well. The last patch updates the
    fadump kernel documentation on the use of the crashkernel parameter.
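
    As context for the "all syntaxes" point above, here is a minimal
    user-space sketch of only the simple "crashkernel=size[@offset]" form.
    It is illustrative, not the kernel's implementation; the real parser
    also accepts range forms such as
    "crashkernel=range1:size1[,range2:size2,...][@offset]", and the helper
    names below are hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>

/* Parse a memory value with an optional K/M/G suffix (hypothetical helper). */
static uint64_t parse_mem(const char *s, const char **end)
{
        char *e;
        uint64_t v = strtoull(s, &e, 0);

        switch (*e) {                   /* optional K/M/G suffix */
        case 'G': v <<= 10;             /* fall through */
        case 'M': v <<= 10;             /* fall through */
        case 'K': v <<= 10; e++; break;
        }
        *end = e;
        return v;
}

/* Parse the simple "size[@offset]" form; returns 0 on full consumption. */
static int parse_crashkernel_simple(const char *arg, uint64_t *size, uint64_t *base)
{
        const char *cur;

        *size = parse_mem(arg, &cur);
        *base = 0;
        if (*cur == '@')
                *base = parse_mem(cur + 1, &cur);
        return *cur == '\0' ? 0 : -1;
}

/* Small accessors so the behaviour is easy to spot-check. */
static uint64_t demo_size(const char *arg)
{
        uint64_t s, b;
        return parse_crashkernel_simple(arg, &s, &b) == 0 ? s : 0;
}

static uint64_t demo_base(const char *arg)
{
        uint64_t s, b;
        return parse_crashkernel_simple(arg, &s, &b) == 0 ? b : 0;
}
```

    For example, "crashkernel=256M@16M" reserves 256 MiB starting at the
    16 MiB physical address.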

    This patch (of 5):

    Traditionally, kdump is used to save vmcore in case of a crash. Some
    architectures like powerpc can save vmcore using architecture specific
    support instead of kexec/kdump mechanism. Such architecture specific
    support also needs to reserve memory, to be used by dump capture kernel.
    The crashkernel parameter can be reused for memory reservation by such
    architecture-specific infrastructure.

    But currently, code related to vmcoreinfo and parsing of crashkernel
    parameter is built under CONFIG_KEXEC_CORE. This patch introduces
    CONFIG_CRASH_CORE and moves the above mentioned code under this config,
    allowing code reuse without dependency on CONFIG_KEXEC. There is no
    functional change with this patch.

    Link: http://lkml.kernel.org/r/149035338104.6881.4550894432615189948.stgit@hbathini.in.ibm.com
    Signed-off-by: Hari Bathini
    Acked-by: Dave Young
    Cc: Fenghua Yu
    Cc: Tony Luck
    Cc: Eric Biederman
    Cc: Mahesh Salgaonkar
    Cc: Vivek Goyal
    Cc: Michael Ellerman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hari Bathini
     

03 May, 2017

1 commit

  • Pull livepatch updates from Jiri Kosina:

    - a per-task consistency model is being added for architectures that
    support reliable stack dumping (extending this currently rather
    trivial set is a work in progress).

    This extends the nature of the types of patches that can be applied
    by live patching infrastructure. The code stems from the design
    proposal made [1] back in November 2014. It's a hybrid of SUSE's
    kGraft and RH's kpatch, combining advantages of both: it uses
    kGraft's per-task consistency and syscall barrier switching combined
    with kpatch's stack trace switching. There are also a number of
    fallback options which make it quite flexible.

    Most of the heavy lifting was done by Josh Poimboeuf, with help from
    Miroslav Benes and Petr Mladek.

    [1] https://lkml.kernel.org/r/20141107140458.GA21774@suse.cz

    - module load time patch optimization from Zhou Chengming

    - a few assorted small fixes

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching:
    livepatch: add missing printk newlines
    livepatch: Cancel transition a safe way for immediate patches
    livepatch: Reduce the time of finding module symbols
    livepatch: make klp_mutex proper part of API
    livepatch: allow removal of a disabled patch
    livepatch: add /proc//patch_state
    livepatch: change to a per-task consistency model
    livepatch: store function sizes
    livepatch: use kstrtobool() in enabled_store()
    livepatch: move patching functions into patch.c
    livepatch: remove unnecessary object loaded check
    livepatch: separate enabled and patched states
    livepatch/s390: add TIF_PATCH_PENDING thread flag
    livepatch/s390: reorganize TIF thread flag bits
    livepatch/powerpc: add TIF_PATCH_PENDING thread flag
    livepatch/x86: add TIF_PATCH_PENDING thread flag
    livepatch: create temporary klp_update_patch_state() stub
    x86/entry: define _TIF_ALLWORK_MASK flags explicitly
    stacktrace/x86: add function for detecting reliable stack traces

    Linus Torvalds
     

19 Apr, 2017

1 commit

  • The definition of smp_mb__after_unlock_lock() is currently smp_mb()
    for CONFIG_PPC and a no-op otherwise. It would be better to instead
    provide an architecture-selectable Kconfig option, and select the
    strength of smp_mb__after_unlock_lock() based on that option. This
    commit therefore creates ARCH_WEAK_RELEASE_ACQUIRE, has PPC select it,
    and bases the definition of smp_mb__after_unlock_lock() on this new
    ARCH_WEAK_RELEASE_ACQUIRE Kconfig option.
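
    A minimal sketch of the shape of the change: the macro's strength now
    follows a Kconfig symbol rather than a hard-coded CONFIG_PPC test.
    Here CONFIG_ARCH_WEAK_RELEASE_ACQUIRE stands in for the real Kconfig
    option and __sync_synchronize() stands in for smp_mb(); the reporting
    helper is purely illustrative.

```c
/* Strength of smp_mb__after_unlock_lock() keyed off the arch-selected symbol. */
#ifdef CONFIG_ARCH_WEAK_RELEASE_ACQUIRE
#define smp_mb__after_unlock_lock()     __sync_synchronize()
#else
#define smp_mb__after_unlock_lock()     do { } while (0)
#endif

/* Reports which variant was compiled in, for illustration only. */
static int unlock_lock_needs_full_barrier(void)
{
#ifdef CONFIG_ARCH_WEAK_RELEASE_ACQUIRE
        return 1;       /* e.g. PPC selects ARCH_WEAK_RELEASE_ACQUIRE */
#else
        return 0;       /* strongly ordered lock handoff: no-op */
#endif
}
```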

    Reported-by: Ingo Molnar
    Signed-off-by: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Will Deacon
    Cc: Boqun Feng
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Acked-by: Michael Ellerman
    Cc:
    Reviewed-by: Josh Triplett

    Paul E. McKenney
     

13 Mar, 2017

1 commit

  • mmap() uses a base address, from which it starts to look for a free space
    for allocation.

    The base address is stored in mm->mmap_base, which is calculated during
    exec(). The address depends on the task size, the rlimit set for the
    stack, and ASLR randomization; the number of random bits differs for
    64-bit and 32-bit applications.

    Because the base address is fixed, an mmap() from a compat (32-bit)
    syscall issued by a 64-bit task will return an address based on the
    64-bit base address that does not fit into the 32-bit address space
    (4GB). The returned pointer is truncated to 32 bits, which results in
    an invalid address.

    To solve this, store a separate compat address base plus a compat
    legacy address base in mm_struct. These bases are calculated at exec()
    time and can be used later to handle a 32-bit compat mmap() issued by
    a 64-bit application.

    As a consequence of this change 32-bit applications issuing a 64-bit
    syscall (after doing a long jump) will get a 64-bit mapping now. Before
    this change 32-bit applications always got a 32bit mapping.
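
    A toy model of the fix (all names and values here are hypothetical):
    mm_struct caches both a native and a compat mmap base, computed at
    exec() time, and the base is chosen by the width of the *syscall*,
    not of the task.

```c
#include <stdint.h>

/* Simplified stand-in for the two cached bases in mm_struct. */
struct mm_model {
        uint64_t mmap_base;             /* native 64-bit base */
        uint64_t mmap_compat_base;      /* 32-bit compat base */
};

/* Example values; real bases depend on task size, rlimits and ASLR. */
static const struct mm_model demo_mm = {
        .mmap_base        = 0x7f0000000000ULL,
        .mmap_compat_base = 0xf7000000ULL,
};

/* Select the base by the width of the syscall, not the task. */
static uint64_t pick_mmap_base(const struct mm_model *mm, int in_compat_syscall)
{
        return in_compat_syscall ? mm->mmap_compat_base : mm->mmap_base;
}
```

    With this, a compat mmap() from a 64-bit task yields an address that
    fits in 32 bits, and a 64-bit syscall from a 32-bit task gets a
    64-bit mapping.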

    [ tglx: Massaged changelog and added a comment ]

    Signed-off-by: Dmitry Safonov
    Cc: 0x7f454c46@gmail.com
    Cc: linux-mm@kvack.org
    Cc: Andy Lutomirski
    Cc: Cyrill Gorcunov
    Cc: Borislav Petkov
    Cc: "Kirill A. Shutemov"
    Link: http://lkml.kernel.org/r/20170306141721.9188-4-dsafonov@virtuozzo.com
    Signed-off-by: Thomas Gleixner

    Dmitry Safonov
     

08 Mar, 2017

1 commit

  • For live patching and possibly other use cases, a stack trace is only
    useful if it can be assured that it's completely reliable. Add a new
    save_stack_trace_tsk_reliable() function to achieve that.

    Note that if the target task isn't the current task, and the target task
    is allowed to run, then it could be writing the stack while the unwinder
    is reading it, resulting in possible corruption. So the caller of
    save_stack_trace_tsk_reliable() must ensure that the task is either
    'current' or inactive.

    save_stack_trace_tsk_reliable() relies on the x86 unwinder's detection
    of pt_regs on the stack. If the pt_regs are not user-mode registers
    from a syscall, then they indicate an in-kernel interrupt or exception
    (e.g. preemption or a page fault), in which case the stack is considered
    unreliable due to the nature of frame pointers.

    It also relies on the x86 unwinder's detection of other issues, such as:

    - corrupted stack data
    - stack grows the wrong way
    - stack walk doesn't reach the bottom
    - user didn't provide a large enough entries array

    Such issues are reported by checking unwind_error() and !unwind_done().

    Also add CONFIG_HAVE_RELIABLE_STACKTRACE so arch-independent code can
    determine at build time whether the function is implemented.
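
    A toy model of the reliability contract described above: a trace is
    usable only when the unwinder reached the bottom of the stack and
    reported no error. The struct and helper names are hypothetical, not
    the kernel's.

```c
/* Minimal stand-in for the unwinder's end-of-walk state. */
struct unwind_model {
        int done;       /* unwind walked all the way to the bottom */
        int error;      /* corruption, wrong growth direction, ... */
};

/* Mirrors "reported by checking unwind_error() and !unwind_done()". */
static int trace_is_reliable(const struct unwind_model *state)
{
        return state->done && !state->error;
}

static const struct unwind_model clean_unwind  = { .done = 1, .error = 0 };
static const struct unwind_model broken_unwind = { .done = 0, .error = 1 };
```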

    Signed-off-by: Josh Poimboeuf
    Reviewed-by: Miroslav Benes
    Acked-by: Ingo Molnar # for the x86 changes
    Signed-off-by: Jiri Kosina

    Josh Poimboeuf
     

28 Feb, 2017

1 commit

  • Fix typos and add the following to the scripts/spelling.txt:

    an user||a user
    an userspace||a userspace

    I also added "userspace" to the list since it is a common word in Linux.
    I found some instances for "an userfaultfd", but I did not add it to the
    list. I felt it is endless to find words that start with "user" such as
    "userland" etc., so must draw a line somewhere.

    Link: http://lkml.kernel.org/r/1481573103-11329-4-git-send-email-yamada.masahiro@socionext.com
    Signed-off-by: Masahiro Yamada
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Masahiro Yamada
     

25 Feb, 2017

1 commit

  • The current transparent hugepage code only supports PMDs. This patch
    adds support for the transparent use of PUDs with DAX. It does not
    include support for anonymous pages. x86 support code is also added.

    Most of this patch simply parallels the work that was done for huge
    PMDs. The only major difference is how the new ->pud_entry method in
    mm_walk works. The ->pmd_entry method replaces the ->pte_entry method,
    whereas the ->pud_entry method works along with either ->pmd_entry or
    ->pte_entry. The pagewalk code takes care of locking the PUD before
    calling ->pud_entry, so handlers do not need to worry about whether
    the PUD is stable.
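
    A hypothetical model of the callback relationship described above:
    ->pud_entry works *alongside* ->pmd_entry/->pte_entry, so a huge PUD
    is handled at its own level and only non-huge PUDs descend further.
    All names below are illustrative stand-ins, not the mm_walk API.

```c
/* Callback table modeled loosely on struct mm_walk. */
struct walk_ops {
        int (*pud_entry)(unsigned long addr);   /* nonzero => handled */
        int (*pmd_entry)(unsigned long addr);
        int (*pte_entry)(unsigned long addr);
};

static int walked_level;        /* 1 = PUD, 3 = PTE, recorded for inspection */

static int handle_pud(unsigned long addr) { (void)addr; walked_level = 1; return 1; }
static int handle_pte(unsigned long addr) { (void)addr; walked_level = 3; return 1; }

/* Walk one address: a huge PUD is consumed by ->pud_entry, otherwise
 * the walk descends to the PMD/PTE callbacks as before. */
static void walk_one(const struct walk_ops *ops, unsigned long addr, int pud_is_huge)
{
        if (pud_is_huge && ops->pud_entry && ops->pud_entry(addr))
                return;                         /* handled as a huge PUD */
        if (ops->pmd_entry && ops->pmd_entry(addr))
                return;
        if (ops->pte_entry)
                ops->pte_entry(addr);
}

static const struct walk_ops demo_ops = {
        .pud_entry = handle_pud,
        .pte_entry = handle_pte,
};

/* Helper exposing which level handled the walk. */
static int level_for(int pud_is_huge)
{
        walk_one(&demo_ops, 0x1000, pud_is_huge);
        return walked_level;
}
```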

    [dave.jiang@intel.com: fix SMP x86 32bit build for native_pud_clear()]
    Link: http://lkml.kernel.org/r/148719066814.31111.3239231168815337012.stgit@djiang5-desk3.ch.intel.com
    [dave.jiang@intel.com: native_pud_clear missing on i386 build]
    Link: http://lkml.kernel.org/r/148640375195.69754.3315433724330910314.stgit@djiang5-desk3.ch.intel.com
    Link: http://lkml.kernel.org/r/148545059381.17912.8602162635537598445.stgit@djiang5-desk3.ch.intel.com
    Signed-off-by: Matthew Wilcox
    Signed-off-by: Dave Jiang
    Tested-by: Alexander Kapshuk
    Cc: Dave Hansen
    Cc: Vlastimil Babka
    Cc: Jan Kara
    Cc: Dan Williams
    Cc: Ross Zwisler
    Cc: Kirill A. Shutemov
    Cc: Nilesh Choudhury
    Cc: Ingo Molnar
    Cc: "H. Peter Anvin"
    Cc: Thomas Gleixner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matthew Wilcox
     

23 Feb, 2017

2 commits

  • Pull networking updates from David Miller:
    "Highlights:

    1) Support TX_RING in AF_PACKET TPACKET_V3 mode, from Sowmini
    Varadhan.

    2) Simplify classifier state on sk_buff in order to shrink it a bit.
    From Willem de Bruijn.

    3) Introduce SIPHASH and its usage for secure sequence numbers and
    syncookies. From Jason A. Donenfeld.

    4) Reduce CPU usage for ICMP replies we are going to limit or
    suppress, from Jesper Dangaard Brouer.

    5) Introduce Shared Memory Communications socket layer, from Ursula
    Braun.

    6) Add RACK loss detection and allow it to actually trigger fast
    recovery instead of just assisting after other algorithms have
    triggered it. From Yuchung Cheng.

    7) Add xmit_more and BQL support to mvneta driver, from Simon Guinot.

    8) skb_cow_data avoidance in esp4 and esp6, from Steffen Klassert.

    9) Export MPLS packet stats via netlink, from Robert Shearman.

    10) Significantly improve inet port bind conflict handling, especially
    when an application is restarted and changes its setting of
    reuseport. From Josef Bacik.

    11) Implement TX batching in vhost_net, from Jason Wang.

    12) Extend the dummy device so that VF (virtual function) features,
    such as configuration, can be more easily tested. From Phil
    Sutter.

    13) Avoid two atomic ops per page on x86 in bnx2x driver, from Eric
    Dumazet.

    14) Add new bpf MAP, implementing a longest prefix match trie. From
    Daniel Mack.

    15) Packet sample offloading support in mlxsw driver, from Yotam Gigi.

    16) Add new aquantia driver, from David VomLehn.

    17) Add bpf tracepoints, from Daniel Borkmann.

    18) Add support for port mirroring to b53 and bcm_sf2 drivers, from
    Florian Fainelli.

    19) Remove custom busy polling in many drivers; it has been done in
    the core networking code since 4.5. From Eric Dumazet.

    20) Support XDP adjust_head in virtio_net, from John Fastabend.

    21) Fix several major holes in neighbour entry confirmation, from
    Julian Anastasov.

    22) Add XDP support to bnxt_en driver, from Michael Chan.

    23) VXLAN offloads for enic driver, from Govindarajulu Varadarajan.

    24) Add IPVTAP driver (IP-VLAN based tap driver) from Sainath Grandhi.

    25) Support GRO in IPSEC protocols, from Steffen Klassert"

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1764 commits)
    Revert "ath10k: Search SMBIOS for OEM board file extension"
    net: socket: fix recvmmsg not returning error from sock_error
    bnxt_en: use eth_hw_addr_random()
    bpf: fix unlocking of jited image when module ronx not set
    arch: add ARCH_HAS_SET_MEMORY config
    net: napi_watchdog() can use napi_schedule_irqoff()
    tcp: Revert "tcp: tcp_probe: use spin_lock_bh()"
    net/hsr: use eth_hw_addr_random()
    net: mvpp2: enable building on 64-bit platforms
    net: mvpp2: switch to build_skb() in the RX path
    net: mvpp2: simplify MVPP2_PRS_RI_* definitions
    net: mvpp2: fix indentation of MVPP2_EXT_GLOBAL_CTRL_DEFAULT
    net: mvpp2: remove unused register definitions
    net: mvpp2: simplify mvpp2_bm_bufs_add()
    net: mvpp2: drop useless fields in mvpp2_bm_pool and related code
    net: mvpp2: remove unused 'tx_skb' field of 'struct mvpp2_tx_queue'
    net: mvpp2: release reference to txq_cpu[] entry after unmapping
    net: mvpp2: handle too large value in mvpp2_rx_time_coal_set()
    net: mvpp2: handle too large value handling in mvpp2_rx_pkts_coal_set()
    net: mvpp2: remove useless arguments in mvpp2_rx_{pkts, time}_coal_set
    ...

    Linus Torvalds
     
  • Pull gcc-plugins updates from Kees Cook:
    "This includes infrastructure updates and the structleak plugin, which
    performs forced initialization of certain structures to avoid possible
    information exposures to userspace.

    Summary:

    - infrastructure updates (gcc-common.h)

    - introduce structleak plugin for forced initialization of some
    structures"

    * tag 'gcc-plugins-v4.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
    gcc-plugins: Add structleak for more stack initialization
    gcc-plugins: consolidate on PASS_INFO macro
    gcc-plugins: add PASS_INFO and build_const_char_string()

    Linus Torvalds
     

22 Feb, 2017

1 commit

  • Currently, there's no good way to test for the presence of
    set_memory_ro/rw/x/nx() helpers implemented by archs such as
    x86, arm, arm64 and s390.

    There are DEBUG_SET_MODULE_RONX and DEBUG_RODATA, but neither really
    reflects that: the set_memory_*() helpers are also available when
    DEBUG_SET_MODULE_RONX is turned off, and DEBUG_RODATA is set by
    parisc, which doesn't implement the above functions. Thus, add
    ARCH_HAS_SET_MEMORY, selected by the mentioned archs, so that generic
    code can test against it.

    This also allows later on to move DEBUG_SET_MODULE_RONX out of
    the arch specific Kconfig to define it only once depending on
    ARCH_HAS_SET_MEMORY.
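
    A sketch of the pattern this enables: generic code keys off the new
    symbol instead of inferring the presence of set_memory_*() from
    DEBUG_SET_MODULE_RONX or DEBUG_RODATA. The symbol name comes from the
    commit; the helper function is a hypothetical illustration.

```c
/* Translate the Kconfig symbol into a testable constant. */
#ifdef CONFIG_ARCH_HAS_SET_MEMORY
#define ARCH_HAS_SET_MEMORY 1
#else
#define ARCH_HAS_SET_MEMORY 0
#endif

/* Generic code can now ask directly whether the helpers exist. */
static int can_use_set_memory(void)
{
        return ARCH_HAS_SET_MEMORY;     /* set_memory_ro()/rw()/x()/nx() exist */
}
```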

    Suggested-by: Laura Abbott
    Signed-off-by: Daniel Borkmann
    Signed-off-by: David S. Miller

    Daniel Borkmann
     

08 Feb, 2017

2 commits

  • Both of these options are poorly named. The features they provide are
    necessary for system security and should not be considered debug only.
    Change the names to CONFIG_STRICT_KERNEL_RWX and
    CONFIG_STRICT_MODULE_RWX to better describe what these options do.

    Signed-off-by: Laura Abbott
    Acked-by: Jessica Yu
    Signed-off-by: Kees Cook

    Laura Abbott
     
  • There are multiple architectures that support CONFIG_DEBUG_RODATA and
    CONFIG_SET_MODULE_RONX. These options also now have the ability to be
    turned off at runtime. Move these to an architecture independent
    location and make these options def_bool y for almost all of those
    arches.

    Signed-off-by: Laura Abbott
    Acked-by: Ingo Molnar
    Acked-by: Heiko Carstens
    Signed-off-by: Kees Cook

    Laura Abbott
     

19 Jan, 2017

2 commits

  • Relaxed ordering (RO) is a feature of the 82599 NIC; enabling it can
    enhance performance on some CPU architectures, such as SPARC.
    Currently the 82599 driver enables the RO feature only for one
    specific CPU architecture (SPARC), which does not generalize to other
    architectures that also need it. This patch adds a common config
    option, CONFIG_ARCH_WANT_RELAX_ORDER, to gate the RO feature, and
    defines CONFIG_ARCH_WANT_RELAX_ORDER in the sparc Kconfig first.
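
    A driver-side sketch of the idea: instead of keying relaxed ordering
    off CONFIG_SPARC directly, the 82599 driver can test a generic symbol
    that any architecture may select. The helper name is hypothetical.

```c
/* Ask the generic symbol rather than hard-coding an architecture. */
static int arch_wants_relaxed_ordering(void)
{
#ifdef CONFIG_ARCH_WANT_RELAX_ORDER
        return 1;                       /* e.g. sparc selects this */
#else
        return 0;
#endif
}
```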

    Signed-off-by: Mao Wenan
    Reviewed-by: Alexander Duyck
    Reviewed-by: Alexander Duyck
    Signed-off-by: David S. Miller

    Mao Wenan
     
  • This plugin detects any structures that contain __user attributes and
    makes sure they are fully initialized, so that a specific class of
    information exposure is eliminated. (This plugin was originally
    designed to block the exposure of siginfo in CVE-2013-2141.)

    Ported from grsecurity/PaX. This version adds a verbose option to the
    plugin and the Kconfig.

    Signed-off-by: Kees Cook

    Kees Cook
     

21 Dec, 2016

1 commit

  • Patch series "ima: carry the measurement list across kexec", v8.

    The TPM PCRs are only reset on a hard reboot. In order to validate a
    TPM's quote after a soft reboot (eg. kexec -e), the IMA measurement
    list of the running kernel must be saved and then restored on the
    subsequent boot, possibly of a different architecture.

    The existing securityfs binary_runtime_measurements file conveniently
    provides a serialized format of the IMA measurement list. This patch
    set serializes the measurement list in this format and restores it.

    Up to now, binary_runtime_measurements was defined in an
    architecture-native format, the assumption being that userspace could
    and would handle any architecture conversions. With the ability to
    carry the measurement list across kexec, possibly from one
    architecture to a different one, the per-boot architecture information
    is lost, and with it the ability to recalculate the template digest
    hash. To resolve this problem without breaking the existing ABI, this
    patch set introduces the boot command line option "ima_canonical_fmt",
    which is arbitrarily defined as little endian.

    The need for this boot command line option will be limited to the
    existing version 1 format of the binary_runtime_measurements.
    Subsequent formats will be defined as canonical format (eg. TPM 2.0
    support for larger digests).
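
    A sketch of the "canonical format" idea: a fixed little-endian byte
    order lets a list written on a big-endian kernel be parsed after
    kexec'ing into a little-endian one. The helpers below are hypothetical
    illustrations, not IMA code.

```c
#include <stdint.h>

/* Serialize a 32-bit value as little endian regardless of host order. */
static void put_le32(uint32_t v, unsigned char out[4])
{
        out[0] = (unsigned char)(v);
        out[1] = (unsigned char)(v >> 8);
        out[2] = (unsigned char)(v >> 16);
        out[3] = (unsigned char)(v >> 24);
}

/* Accessor for spot-checking individual serialized bytes. */
static unsigned char le32_byte(uint32_t v, int i)
{
        unsigned char b[4];
        put_le32(v, b);
        return b[i];
}
```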

    A simplified method of Thiago Bauermann's "kexec buffer handover" patch
    series for carrying the IMA measurement list across kexec is included in
    this patch set. The simplified method requires all file measurements be
    taken prior to executing the kexec load, as subsequent measurements will
    not be carried across the kexec and restored.

    This patch (of 10):

    The IMA kexec buffer allows the currently running kernel to pass the
    measurement list via a kexec segment to the kernel that will be kexec'd.
    The second kernel can check whether the previous kernel sent the buffer
    and retrieve it.

    This is the architecture-specific part which enables IMA to receive the
    measurement list passed by the previous kernel. It will be used in the
    next patch.

    The change in machine_kexec_64.c is to factor out the logic of removing
    an FDT memory reservation so that it can be used by remove_ima_buffer.

    Link: http://lkml.kernel.org/r/1480554346-29071-2-git-send-email-zohar@linux.vnet.ibm.com
    Signed-off-by: Thiago Jung Bauermann
    Signed-off-by: Mimi Zohar
    Acked-by: "Eric W. Biederman"
    Cc: Andreas Steffen
    Cc: Dmitry Kasatkin
    Cc: Josh Sklar
    Cc: Dave Young
    Cc: Vivek Goyal
    Cc: Baoquan He
    Cc: Michael Ellerman
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: Stewart Smith
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thiago Jung Bauermann
     

14 Dec, 2016

1 commit


13 Dec, 2016

1 commit

  • Pull scheduler updates from Ingo Molnar:
    "The main scheduler changes in this cycle were:

    - support Intel Turbo Boost Max Technology 3.0 (TBM3) by introducing a
    notion of 'better cores', which the scheduler will prefer to
    schedule single threaded workloads on. (Tim Chen, Srinivas
    Pandruvada)

    - enhance the handling of asymmetric capacity CPUs further (Morten
    Rasmussen)

    - improve/fix load handling when moving tasks between task groups
    (Vincent Guittot)

    - simplify and clean up the cputime code (Stanislaw Gruszka)

    - improve mass fork()ed task spread a.k.a. hackbench speedup (Vincent
    Guittot)

    - make struct kthread kmalloc()ed and related fixes (Oleg Nesterov)

    - add uaccess atomicity debugging (when using access_ok() in the
    wrong context), under CONFIG_DEBUG_ATOMIC_SLEEP=y (Peter Zijlstra)

    - implement various fixes, cleanups and other enhancements (Daniel
    Bristot de Oliveira, Martin Schwidefsky, Rafael J. Wysocki)"

    * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (41 commits)
    sched/core: Use load_avg for selecting idlest group
    sched/core: Fix find_idlest_group() for fork
    kthread: Don't abuse kthread_create_on_cpu() in __kthread_create_worker()
    kthread: Don't use to_live_kthread() in kthread_[un]park()
    kthread: Don't use to_live_kthread() in kthread_stop()
    Revert "kthread: Pin the stack via try_get_task_stack()/put_task_stack() in to_live_kthread() function"
    kthread: Make struct kthread kmalloc'ed
    x86/uaccess, sched/preempt: Verify access_ok() context
    sched/x86: Make CONFIG_SCHED_MC_PRIO=y easier to enable
    sched/x86: Change CONFIG_SCHED_ITMT to CONFIG_SCHED_MC_PRIO
    x86/sched: Use #include instead of #include
    cpufreq/intel_pstate: Use CPPC to get max performance
    acpi/bus: Set _OSC for diverse core support
    acpi/bus: Enable HWP CPPC objects
    x86/sched: Add SD_ASYM_PACKING flags to x86 ITMT CPU
    x86/sysctl: Add sysctl for ITMT scheduling feature
    x86: Enable Intel Turbo Boost Max Technology 3.0
    x86/topology: Define x86's arch_update_cpu_topology
    sched: Extend scheduler's asym packing
    sched/fair: Clean up the tunable parameter definitions
    ...

    Linus Torvalds
     

12 Dec, 2016

1 commit


15 Nov, 2016

1 commit

  • Only s390 and powerpc have hardware facilities that allow measuring
    cputime scaled by CPU frequency. On all other architectures,
    utimescaled/stimescaled are equal to utime/stime (although they are
    accounted separately).

    Remove {u,s}timescaled accounting on all architectures except
    powerpc and s390, where those values are explicitly accounted
    in the proper places.
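
    A model of the cleanup: only architectures with scaled-cputime
    hardware compute a distinct scaled value; everywhere else the scaled
    time just mirrors the raw time, so the separate bookkeeping can go
    away. CONFIG_ARCH_HAS_SCALED_CPUTIME is a stand-in symbol here, not
    necessarily the one the kernel uses.

```c
#include <stdint.h>

/* Scaled accounting exists only where the hardware provides it. */
static uint64_t scaled_utime(uint64_t utime, uint64_t mult, uint64_t divisor)
{
#ifdef CONFIG_ARCH_HAS_SCALED_CPUTIME
        return utime * mult / divisor;  /* frequency-scaled accounting */
#else
        (void)mult; (void)divisor;
        return utime;                   /* utimescaled == utime */
#endif
}
```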

    Signed-off-by: Stanislaw Gruszka
    Signed-off-by: Frederic Weisbecker
    Cc: Benjamin Herrenschmidt
    Cc: Heiko Carstens
    Cc: Linus Torvalds
    Cc: Martin Schwidefsky
    Cc: Michael Neuling
    Cc: Paul Mackerras
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/20161031162143.GB12646@redhat.com
    Signed-off-by: Ingo Molnar

    Stanislaw Gruszka