31 Jan, 2018

1 commit

  • [ upstream commit 290af86629b25ffd1ed6232c4e9107da031705cb ]

    The BPF interpreter has been used as part of the spectre 2 attack CVE-2017-5715.

    A quote from the Google Project Zero blog:
    "At this point, it would normally be necessary to locate gadgets in
    the host kernel code that can be used to actually leak data by reading
    from an attacker-controlled location, shifting and masking the result
    appropriately and then using the result of that as offset to an
    attacker-controlled address for a load. But piecing gadgets together
    and figuring out which ones work in a speculation context seems annoying.
    So instead, we decided to use the eBPF interpreter, which is built into
    the host kernel - while there is no legitimate way to invoke it from inside
    a VM, the presence of the code in the host kernel's text section is sufficient
    to make it usable for the attack, just like with ordinary ROP gadgets."

    To make the attacker's job harder, introduce the BPF_JIT_ALWAYS_ON config
    option, which removes the interpreter from the kernel in favor of JIT-only mode.
    So far eBPF JIT is supported by:
    x64, arm64, arm32, sparc64, s390, powerpc64, mips64

    The start of JITed program is randomized and code page is marked as read-only.
    In addition "constant blinding" can be turned on with net.core.bpf_jit_harden
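
    The idea behind constant blinding can be sketched in user-space C. This is
    only an illustration under stated assumptions, not the kernel's JIT code:
    the names blinded_imm, blind_imm() and unblind_imm() are hypothetical. The
    point is that an attacker-chosen immediate never appears verbatim in the
    emitted code; it is stored XORed with a random mask and recovered by an
    extra XOR at run time.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical pair of values a JIT would embed instead of the raw immediate. */
struct blinded_imm {
    uint32_t masked; /* imm ^ rnd: what replaces the attacker-chosen constant */
    uint32_t rnd;    /* random mask, unpredictable to the attacker */
};

/* "JIT time": hide the attacker-chosen immediate behind a random mask. */
static struct blinded_imm blind_imm(uint32_t imm, uint32_t rnd)
{
    struct blinded_imm b = { imm ^ rnd, rnd };
    return b;
}

/* "Run time": a load followed by an XOR recovers the original value. */
static uint32_t unblind_imm(struct blinded_imm b)
{
    return b.masked ^ b.rnd;
}
```

    In this sketch the program still computes with the original constant, but
    the bytes placed in executable memory depend on the random mask.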

    v2->v3:
    - move __bpf_prog_ret0 under ifdef (Daniel)

    v1->v2:
    - fix init order, test_bpf and cBPF (Daniel's feedback)
    - fix offloaded bpf (Jakub's feedback)
    - add 'return 0' dummy in case something can invoke prog->bpf_func
    - retarget bpf tree. For bpf-next the patch would need one extra hunk.
    It will be sent when the trees are merged back to net-next

    Considered doing:
    int bpf_jit_enable __read_mostly = BPF_EBPF_JIT_DEFAULT;
    but it seems better to land the patch as-is and in bpf-next remove
    bpf_jit_enable global variable from all JITs, consolidate in one place
    and remove this jit_init() function.

    Signed-off-by: Alexei Starovoitov
    Signed-off-by: Daniel Borkmann
    Signed-off-by: Greg Kroah-Hartman

    Alexei Starovoitov
     

07 Oct, 2017

1 commit

  • The choice containing the CC_OPTIMIZE_FOR_PERFORMANCE symbol
    accidentally added a "CONFIG_" prefix when trying to make it the
    default, selecting an undefined symbol as the default.

    The mistake is harmless here: Since the default symbol is not visible,
    the choice falls back on using the visible symbol as the default
    instead, which is CC_OPTIMIZE_FOR_PERFORMANCE, as intended.

    A patch that makes Kconfig print a warning in this case has been
    submitted separately:
    http://www.spinics.net/lists/linux-kbuild/msg15566.html

    Signed-off-by: Ulf Magnusson
    Acked-by: Arnd Bergmann
    Signed-off-by: Masahiro Yamada

    Ulf Magnusson
     

07 Sep, 2017

1 commit

  • This SLUB free list pointer obfuscation code is modified from Brad
    Spengler/PaX Team's code in the last public patch of grsecurity/PaX
    based on my understanding of the code. Changes or omissions from the
    original code are mine and don't reflect the original grsecurity/PaX
    code.

    This adds a per-cache random value to SLUB caches that is XORed with
    their freelist pointer address and value. This adds nearly zero
    overhead and frustrates the very common heap overflow exploitation
    method of overwriting freelist pointers.
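
    A minimal user-space sketch of the XOR scheme described above, assuming a
    hypothetical struct cache that carries only the per-cache random value.
    It is not the kernel's code; it only illustrates why a raw pointer written
    by an overflow no longer decodes to the address the attacker intended.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a SLUB cache; only the per-cache secret is modeled. */
struct cache {
    uintptr_t random; /* chosen once at cache creation */
};

/*
 * Encode or decode a freelist pointer. XOR is its own inverse, so the same
 * function serves both directions. ptr_addr is the address of the slot in
 * which the pointer is stored.
 */
static uintptr_t freelist_xor(const struct cache *c, uintptr_t ptr,
                              uintptr_t ptr_addr)
{
    return ptr ^ c->random ^ ptr_addr;
}
```

    Storing freelist_xor(c, next, slot) and applying the same function on load
    returns next; a value written without knowledge of the secret decodes to
    an unrelated address.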

    A recent example of the attack is written up here:

    http://cyseclabs.com/blog/cve-2016-6187-heap-off-by-one-exploit

    and there is a section dedicated to the technique in the book "A Guide to
    Kernel Exploitation: Attacking the Core".

    This is based on patches by Daniel Micay, and refactored to minimize the
    use of #ifdef.

    With 200-count cycles of "hackbench -g 20 -l 1000" I saw the following
    run times:

    before:
    mean 10.11882499999999999995
    variance .03320378329145728642
    stdev .18221905304181911048

    after:
    mean 10.12654000000000000014
    variance .04700556623115577889
    stdev .21680767106160192064

    The difference gets lost in the noise, but if the above is to be taken
    literally, using CONFIG_FREELIST_HARDENED is 0.07% slower.
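
    The percentage follows directly from the two means quoted above. A small
    sketch of the arithmetic (the helper name slowdown_percent() is
    hypothetical): (10.12654 - 10.118825) / 10.118825 is roughly 0.00076, or
    about 0.076%, which the message states as 0.07%.

```c
#include <assert.h>

/* Relative slowdown, in percent, of `after` compared with `before`. */
static double slowdown_percent(double before, double after)
{
    assert(before > 0.0);
    return (after - before) / before * 100.0;
}
```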

    Link: http://lkml.kernel.org/r/20170802180609.GA66807@beast
    Signed-off-by: Kees Cook
    Suggested-by: Daniel Micay
    Cc: Rik van Riel
    Cc: Tycho Andersen
    Cc: Alexander Popov
    Cc: Christoph Lameter
    Cc: Pekka Enberg
    Cc: David Rientjes
    Cc: Joonsoo Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kees Cook
     

01 Aug, 2017

1 commit

  • This makes it possible to preserve basic futex support and compile out the
    PI support when RT mutexes are not available.

    Signed-off-by: Nicolas Pitre
    Signed-off-by: Thomas Gleixner
    Cc: Peter Zijlstra
    Cc: Darren Hart
    Link: http://lkml.kernel.org/r/alpine.LFD.2.20.1708010024190.5981@knanqh.ubzr

    Nicolas Pitre
     

07 Jul, 2017

2 commits

  • Some hardened environments want to build kernels with slab_nomerge
    already set (so that they do not depend on remembering to set the kernel
    command line option). This is desired to reduce the risk of kernel heap
    overflows being able to overwrite objects from merged caches and changes
    the requirements for cache layout control, increasing the difficulty of
    these attacks. By keeping caches unmerged, these kinds of exploits can
    usually only damage objects in the same cache (though the risk to
    metadata exploitation is unchanged).

    Link: http://lkml.kernel.org/r/20170620230911.GA25238@beast
    Signed-off-by: Kees Cook
    Cc: Daniel Micay
    Cc: David Windsor
    Cc: Eric Biggers
    Cc: Christoph Lameter
    Cc: Jonathan Corbet
    Cc: Pekka Enberg
    Cc: David Rientjes
    Cc: Joonsoo Kim
    Cc: "Rafael J. Wysocki"
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: Mauro Carvalho Chehab
    Cc: "Paul E. McKenney"
    Cc: Arnd Bergmann
    Cc: Andy Lutomirski
    Cc: Nicolas Pitre
    Cc: Tejun Heo
    Cc: Daniel Mack
    Cc: Sebastian Andrzej Siewior
    Cc: Sergey Senozhatsky
    Cc: Helge Deller
    Cc: Rik van Riel
    Cc: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kees Cook
     
  • Pull cgroup changes from Tejun Heo:

    - Waiman made the debug controller work and a lot more useful on
    cgroup2

    - There were a couple issues with cgroup subtree delegation. The
    documentation on delegating to a non-root user was missing some part
    and cgroup namespace support wasn't factoring in delegation at all.
    The documentation is updated and now there is a mount option to
    make cgroup namespace fit for delegation

    * 'for-4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
    cgroup: implement "nsdelegate" mount option
    cgroup: restructure cgroup_procs_write_permission()
    cgroup: "cgroup.subtree_control" should be writeable by delegatee
    cgroup: fix lockdep warning in debug controller
    cgroup: refactor cgroup_masks_read() in the debug controller
    cgroup: make debug an implicit controller on cgroup2
    cgroup: Make debug cgroup support v2 and thread mode
    cgroup: Make Kconfig prompt of debug cgroup more accurate
    cgroup: Move debug cgroup to its own file
    cgroup: Keep accurate count of tasks in each css_set

    Linus Torvalds
     

04 Jul, 2017

1 commit

  • Pull scheduler updates from Ingo Molnar:
    "The main changes in this cycle were:

    - Add the SYSTEM_SCHEDULING bootup state to move various scheduler
    debug checks earlier into the bootup. This turns silent and
    sporadically deadly bugs into nice, deterministic splats. Fix some
    of the splats that triggered. (Thomas Gleixner)

    - A round of restructuring and refactoring of the load-balancing and
    topology code (Peter Zijlstra)

    - Another round of consolidating ~20 years of incremental scheduler code
    history: this time in terms of wait-queue nomenclature. (I didn't
    get much feedback on these renaming patches, and we can still
    easily change any names I might have misplaced, so if anyone hates
    a new name, please holler and I'll fix it.) (Ingo Molnar)

    - sched/numa improvements, fixes and updates (Rik van Riel)

    - Another round of x86/tsc scheduler clock code improvements, in hope
    of making it more robust (Peter Zijlstra)

    - Improve NOHZ behavior (Frederic Weisbecker)

    - Deadline scheduler improvements and fixes (Luca Abeni, Daniel
    Bristot de Oliveira)

    - Simplify and optimize the topology setup code (Lauro Ramos
    Venancio)

    - Debloat and decouple scheduler code some more (Nicolas Pitre)

    - Simplify code by making better use of llist primitives (Byungchul
    Park)

    - ... plus other fixes and improvements"

    * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (103 commits)
    sched/cputime: Refactor the cputime_adjust() code
    sched/debug: Expose the number of RT/DL tasks that can migrate
    sched/numa: Hide numa_wake_affine() from UP build
    sched/fair: Remove effective_load()
    sched/numa: Implement NUMA node level wake_affine()
    sched/fair: Simplify wake_affine() for the single socket case
    sched/numa: Override part of migrate_degrades_locality() when idle balancing
    sched/rt: Move RT related code from sched/core.c to sched/rt.c
    sched/deadline: Move DL related code from sched/core.c to sched/deadline.c
    sched/cpuset: Only offer CONFIG_CPUSETS if SMP is enabled
    sched/fair: Spare idle load balancing on nohz_full CPUs
    nohz: Move idle balancer registration to the idle path
    sched/loadavg: Generalize "_idle" naming to "_nohz"
    sched/core: Drop the unused try_get_task_struct() helper function
    sched/fair: WARN() and refuse to set buddy when !se->on_rq
    sched/debug: Fix SCHED_WARN_ON() to return a value on !CONFIG_SCHED_DEBUG as well
    sched/wait: Disambiguate wq_entry->task_list and wq_head->task_list naming
    sched/wait: Move bit_wait_table[] and related functionality from sched/core.c to sched/wait_bit.c
    sched/wait: Split out the wait_bit*() APIs from <linux/wait.h> into <linux/wait_bit.h>
    sched/wait: Re-adjust macro line continuation backslashes in <linux/wait.h>
    ...

    Linus Torvalds
     

23 Jun, 2017

1 commit

  • Make CONFIG_CPUSETS=y depend on SMP as this feature makes no sense
    on UP. This allows for configuring out cpuset_cpumask_can_shrink()
    and task_can_attach() entirely, which shrinks the kernel a bit.

    Signed-off-by: Nicolas Pitre
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/20170614171926.8345-2-nicolas.pitre@linaro.org
    Signed-off-by: Ingo Molnar

    Nicolas Pitre
     

15 Jun, 2017

1 commit


09 Jun, 2017

6 commits

  • RCU's Kconfig options are scattered, and there are enough of them
    that it would be good for them to be more centralized. This commit
    therefore extracts RCU's Kconfig options from init/Kconfig into a new
    kernel/rcu/Kconfig file.

    Reported-by: Ingo Molnar
    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • The CONFIG_RCU_NOCB_CPU_ALL, CONFIG_RCU_NOCB_CPU_NONE, and
    CONFIG_RCU_NOCB_CPU_ZERO Kconfig options are used only in testing and
    are redundant with the rcu_nocbs= boot parameter. This commit therefore
    removes these three Kconfig options and adjusts the rcutorture scripts
    to use the boot parameter instead.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • RCU's debugfs tracing used to be the only reasonable low-level debug
    information available, but ftrace and event tracing has since surpassed
    the RCU debugfs level of usefulness. This commit therefore removes
    RCU's debugfs tracing.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • Classic SRCU was only ever intended to be a fallback in case of issues
    with Tree/Tiny SRCU, and the latter two are doing quite well in testing.
    This commit therefore removes Classic SRCU.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • Anything that can be done with the RCU_KTHREAD_PRIO Kconfig option can
    also be done with the rcutree.kthread_prio kernel boot parameter.
    This commit therefore removes this Kconfig option.

    Reported-by: Linus Torvalds
    Signed-off-by: Paul E. McKenney
    Cc: Frederic Weisbecker
    Cc: Rik van Riel

    Paul E. McKenney
     
  • The rcu_segcblist structure provides quite a bit of functionality, and
    Tiny SRCU needs almost none of it. So this commit replaces Tiny SRCU's
    uses of rcu_segcblist with a simple singly linked list with tail pointer.
    This change significantly reduces Tiny SRCU's memory footprint, more
    than making up for the growth caused by the creation of rcu_segcblist.c
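
    The "singly linked list with tail pointer" idea can be sketched as
    follows. This is an illustration only, with hypothetical names (cb,
    cb_list and the three helpers); it is not Tiny SRCU's actual code. The
    tail is kept as a pointer to the link that the next append must fill in,
    so enqueueing is O(1) and an empty list needs no special case.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback node. */
struct cb {
    struct cb *next;
};

/* Singly linked list with a tail pointer, giving O(1) append. */
struct cb_list {
    struct cb *head;
    struct cb **tail; /* the `next` field to fill in, or &head when empty */
};

static void cb_list_init(struct cb_list *l)
{
    l->head = NULL;
    l->tail = &l->head;
}

/* Append one callback at the end of the list. */
static void cb_list_enqueue(struct cb_list *l, struct cb *c)
{
    c->next = NULL;
    *l->tail = c;
    l->tail = &c->next;
}

/* Detach and return the whole list, leaving `l` empty. */
static struct cb *cb_list_take_all(struct cb_list *l)
{
    struct cb *first = l->head;

    cb_list_init(l);
    return first;
}
```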

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     

08 Jun, 2017

1 commit

    Commit d160a727c40e ("srcu: Make SRCU be built by default") was made in
    response to build errors, which were caused by code that included srcu.h
    despite !SRCU. However, srcutiny.o is almost 2K of code, which is not
    insignificant for those attempting to run the Linux kernel on IoT devices.
    This commit therefore makes SRCU be once again optional, and adjusts
    srcu.h to allow error-free inclusion in !SRCU kernel builds.

    Signed-off-by: Paul E. McKenney
    Acked-by: Nicolas Pitre

    Paul E. McKenney
     

02 May, 2017

1 commit


24 Apr, 2017

2 commits

  • SRCU is optional, and included only if there is a "select SRCU" in effect.
    However, we now have Tiny SRCU, so this commit defaults CONFIG_SRCU=y.

    Reported-by: kbuild test robot
    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • If the CONFIG_SRCU option is not selected, for example, when building
    arch/tile allnoconfig, the following build errors appear:

    kernel/rcu/tree.o: In function `srcu_online_cpu':
    tree.c:(.text+0x4248): multiple definition of `srcu_online_cpu'
    kernel/rcu/srcutree.o:srcutree.c:(.text+0x2120): first defined here
    kernel/rcu/tree.o: In function `srcu_offline_cpu':
    tree.c:(.text+0x4250): multiple definition of `srcu_offline_cpu'
    kernel/rcu/srcutree.o:srcutree.c:(.text+0x2160): first defined here

    The corresponding .config file shows CONFIG_TREE_SRCU=y, but no sign
    of CONFIG_SRCU, which fatally confuses SRCU's #ifdefs, resulting in
    the above errors. The reason this occurs is the following line in
    init/Kconfig's definition for TREE_SRCU:

    default y if !TINY_RCU && !CLASSIC_SRCU

    If CONFIG_CLASSIC_SRCU=n, as it will be for allnoconfig, and if
    CONFIG_SMP=y, then we will get CONFIG_TREE_SRCU=y but no CONFIG_SRCU,
    as seen in the .config file, and which will result in the above errors.
    This error did not show up during rcutorture testing because rcutorture
    forces CONFIG_SRCU=y, as it must to prevent build errors in rcutorture.c.

    This commit therefore conditions TREE_SRCU (and TINY_SRCU, while it is
    at it) with SRCU, like this:

    default y if SRCU && !TINY_RCU && !CLASSIC_SRCU
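
    The effect of the added SRCU term can be checked with a small truth-table
    sketch. The struct and function names below are hypothetical and model
    only the three symbols that appear in the condition; this is not how
    Kconfig itself evaluates defaults.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the relevant symbols in a .config. */
struct config {
    bool srcu;
    bool tiny_rcu;
    bool classic_srcu;
};

/* Old condition: default y if !TINY_RCU && !CLASSIC_SRCU */
static bool tree_srcu_old(const struct config *c)
{
    return !c->tiny_rcu && !c->classic_srcu;
}

/* New condition: default y if SRCU && !TINY_RCU && !CLASSIC_SRCU */
static bool tree_srcu_new(const struct config *c)
{
    return c->srcu && !c->tiny_rcu && !c->classic_srcu;
}
```

    For the failing case (SRCU, TINY_RCU and CLASSIC_SRCU all unset), the old
    condition yields y and the new one yields n.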

    Reported-by: kbuild test robot
    Reported-by: Ingo Molnar
    Signed-off-by: Paul E. McKenney
    Link: http://lkml.kernel.org/r/20170423162205.GP3956@linux.vnet.ibm.com
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
     

21 Apr, 2017

1 commit


20 Apr, 2017

1 commit


19 Apr, 2017

2 commits

  • The TREE_SRCU rewrite is large and a bit on the non-simple side, so
    this commit helps reduce risk by allowing the old v4.11 SRCU algorithm
    to be selected using a new CLASSIC_SRCU Kconfig option that depends
    on RCU_EXPERT. The default is to use the new TREE_SRCU and TINY_SRCU
    algorithms, in order to help get these the testing that they need.
    However, if your users do not require the update-side scalability that
    is to be provided by TREE_SRCU, select RCU_EXPERT and then CLASSIC_SRCU
    to revert back to the old classic SRCU algorithm.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • In response to automated complaints about modifications to SRCU
    increasing its size, this commit creates a tiny SRCU that is
    used in SMP=n && PREEMPT=n builds.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     

28 Feb, 2017

1 commit

  • Pull cgroup updates from Tejun Heo:
    "Several noteworthy changes.

    - Parav's rdma controller is finally merged. It is very
    straightforward and can limit the absolute numbers of common rdma
    constructs used by different cgroups.

    - kernel/cgroup.c got too chubby and disorganized. Created
    kernel/cgroup/ subdirectory and moved all cgroup related files
    under kernel/ there and reorganized the core code. This hurts for
    backporting patches but was long overdue.

    - cgroup v2 process listing reimplemented so that it no longer
    depends on allocating a buffer large enough to cache the entire
    result to sort and uniq the output. v2 has always mangled the sort
    order to ensure that users don't depend on the sorted output, so
    this shouldn't surprise anybody. This makes the pid listing
    functions use the same iterators that are used internally, which
    have to have the same iterating capabilities anyway.

    - perf cgroup filtering now works automatically on cgroup v2. This
    patch was posted a long time ago but somehow fell through the
    cracks.

    - misc fixes and documentation updates"

    * 'for-4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (27 commits)
    kernfs: fix locking around kernfs_ops->release() callback
    cgroup: drop the matching uid requirement on migration for cgroup v2
    cgroup, perf_event: make perf_event controller work on cgroup2 hierarchy
    cgroup: misc cleanups
    cgroup: call subsys->*attach() only for subsystems which are actually affected by migration
    cgroup: track migration context in cgroup_mgctx
    cgroup: cosmetic update to cgroup_taskset_add()
    rdmacg: Fixed uninitialized current resource usage
    cgroup: Add missing cgroup-v2 PID controller documentation.
    rdmacg: Added documentation for rdmacg
    IB/core: added support to use rdma cgroup controller
    rdmacg: Added rdma cgroup controller
    cgroup: fix a comment typo
    cgroup: fix RCU related sparse warnings
    cgroup: move namespace code to kernel/cgroup/namespace.c
    cgroup: rename functions for consistency
    cgroup: move v1 mount functions to kernel/cgroup/cgroup-v1.c
    cgroup: separate out cgroup1_kf_syscall_ops
    cgroup: refactor mount path and clearly distinguish v1 and v2 paths
    cgroup: move cgroup v1 specific code to kernel/cgroup/cgroup-v1.c
    ...

    Linus Torvalds
     

23 Feb, 2017

4 commits

  • Merge updates from Andrew Morton:
    "142 patches:

    - DAX updates

    - various misc bits

    - OCFS2 updates

    - most of MM"

    * emailed patches from Andrew Morton: (142 commits)
    mm/z3fold.c: limit first_num to the actual range of possible buddy indexes
    mm: fix stray kernel-doc notation
    zram: remove obsolete sysfs attrs
    mm/memblock.c: remove unnecessary log and clean up
    oom-reaper: use madvise_dontneed() logic to decide if unmap the VMA
    mm: drop unused argument of zap_page_range()
    mm: drop zap_details::check_swap_entries
    mm: drop zap_details::ignore_dirty
    mm, page_alloc: warn_alloc nodemask is NULL when cpusets are disabled
    mm: help __GFP_NOFAIL allocations which do not trigger OOM killer
    mm, oom: do not enforce OOM killer for __GFP_NOFAIL automatically
    mm: consolidate GFP_NOFAIL checks in the allocator slowpath
    lib/show_mem.c: teach show_mem to work with the given nodemask
    arch, mm: remove arch specific show_mem
    mm, page_alloc: warn_alloc print nodemask
    mm, page_alloc: do not report all nodes in show_mem
    Revert "mm: bail out in shrink_inactive_list()"
    mm, vmscan: consider eligible zones in get_scan_count
    mm, vmscan: cleanup lru size claculations
    mm, vmscan: do not count freed pages as PGDEACTIVATE
    ...

    Linus Torvalds
     
  • Pull printk updates from Petr Mladek:

    - Add Petr Mladek, Sergey Senozhatsky as printk maintainers, and Steven
    Rostedt as the printk reviewer. This idea came up after the
    discussion about printk issues at Kernel Summit. It was formulated
    and discussed at lkml[1].

    - Extend a lock-less NMI per-cpu buffers idea to handle recursive
    printk() calls by Sergey Senozhatsky[2]. It is the first step in
    sanitizing printk as discussed at Kernel Summit.

    The change makes it possible to see messages that would normally get
    ignored or would cause a deadlock.

    It also makes it possible to enable lockdep in printk(). This already paid off.
    The testing in linux-next helped to discover two old problems that
    were hidden before[3][4].

    - Remove unused parameter by Sergey Senozhatsky. Clean up after a past
    change.

    [1] http://lkml.kernel.org/r/1481798878-31898-1-git-send-email-pmladek@suse.com
    [2] http://lkml.kernel.org/r/20161227141611.940-1-sergey.senozhatsky@gmail.com
    [3] http://lkml.kernel.org/r/20170215044332.30449-1-sergey.senozhatsky@gmail.com
    [4] http://lkml.kernel.org/r/20170217015932.11898-1-sergey.senozhatsky@gmail.com

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk:
    printk: drop call_console_drivers() unused param
    printk: convert the rest to printk-safe
    printk: remove zap_locks() function
    printk: use printk_safe buffers in printk
    printk: report lost messages in printk safe/nmi contexts
    printk: always use deferred printk when flush printk_safe lines
    printk: introduce per-cpu safe_print seq buffer
    printk: rename nmi.c and exported api
    printk: use vprintk_func in vprintk()
    MAINTAINERS: Add printk maintainers

    Linus Torvalds
     
  • SLUB creates a per-cache directory under /sys/kernel/slab which hosts a
    bunch of debug files. Usually, there aren't that many caches on a
    system and this doesn't really matter; however, if memcg is in use, each
    cache can have per-cgroup sub-caches. SLUB creates the same directories
    for these sub-caches under /sys/kernel/slab/$CACHE/cgroup.

    Unfortunately, because there can be a lot of cgroups, active or
    draining, the product of the numbers of caches, cgroups and files in
    each directory can reach a very high number - hundreds of thousands is
    commonplace. Millions and beyond aren't difficult to reach either.

    What's under /sys/kernel/slab is primarily for debugging and the
    information and control on a root cache already cover its
    sub-caches. While having a separate directory for each sub-cache can be
    helpful for development, it doesn't make much sense to pay this amount
    of overhead by default.

    This patch introduces a boot parameter slub_memcg_sysfs which determines
    whether to create sysfs directories for per-memcg sub-caches. It also
    adds CONFIG_SLUB_MEMCG_SYSFS_ON which determines the boot parameter's
    default value and defaults to 0.

    [akpm@linux-foundation.org: kset_unregister(NULL) is legal]
    Link: http://lkml.kernel.org/r/20170204145203.GB26958@mtj.duckdns.org
    Signed-off-by: Tejun Heo
    Cc: Christoph Lameter
    Cc: Pekka Enberg
    Cc: David Rientjes
    Cc: Joonsoo Kim
    Cc: Vladimir Davydov
    Cc: Michal Hocko
    Cc: Johannes Weiner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tejun Heo
     
  • Pull char/misc driver updates from Greg KH:
    "Here is the big char/misc driver patchset for 4.11-rc1.

    Lots of different driver subsystems updated here: rework for the
    hyperv subsystem to handle new platforms better, mei and w1 and extcon
    driver updates, as well as a number of other "minor" driver updates.

    All of these have been in linux-next for a while with no reported
    issues"

    * tag 'char-misc-4.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (169 commits)
    goldfish: Sanitize the broken interrupt handler
    x86/platform/goldfish: Prevent unconditional loading
    vmbus: replace modulus operation with subtraction
    vmbus: constify parameters where possible
    vmbus: expose hv_begin/end_read
    vmbus: remove conditional locking of vmbus_write
    vmbus: add direct isr callback mode
    vmbus: change to per channel tasklet
    vmbus: put related per-cpu variable together
    vmbus: callback is in softirq not workqueue
    binder: Add support for file-descriptor arrays
    binder: Add support for scatter-gather
    binder: Add extra size to allocator
    binder: Refactor binder_transact()
    binder: Support multiple /dev instances
    binder: Deal with contexts in debugfs
    binder: Support multiple context managers
    binder: Split flat_binder_object
    auxdisplay: ht16k33: remove private workqueue
    auxdisplay: ht16k33: rework input device initialization
    ...

    Linus Torvalds
     

21 Feb, 2017

1 commit

  • Pull RCU updates from Ingo Molnar:
    "The RCU changes in this cycle are:

    - Dynticks updates, consolidating open-coded counter accesses into a
    well-defined API

    - SRCU updates: Simplify algorithm, add formal verification

    - Documentation updates

    - Miscellaneous fixes

    - Torture-test updates

    Most of the diffstat comes from the relatively large documentation
    update"

    * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (42 commits)
    srcu: Reduce probability of SRCU ->unlock_count[] counter overflow
    rcutorture: Add CBMC-based formal verification for SRCU
    srcu: Force full grace-period ordering
    srcu: Implement more-efficient reader counts
    rcu: Adjust FQS offline checks for exact online-CPU detection
    rcu: Check cond_resched_rcu_qs() state less often to reduce GP overhead
    rcu: Abstract extended quiescent state determination
    rcu: Abstract dynticks extended quiescent state enter/exit operations
    rcu: Add lockdep checks to synchronous expedited primitives
    rcu: Eliminate unused expedited_normal counter
    llist: Clarify comments about when locking is needed
    rcu: Fix comment in rcu_organize_nocb_kthreads()
    rcu: Enable RCU tracepoints by default to aid in debugging
    rcu: Make rcu_cpu_starting() use its "cpu" argument
    rcu: Add comment headers to expedited-grace-period counter functions
    rcu: Don't wake rcuc/X kthreads on NOCB CPUs
    rcu: Re-enable TASKS_RCU for User Mode Linux
    rcu: Once again use NMI-based stack traces in stall warnings
    rcu: Remove short-term CPU kicking
    rcu: Add long-term CPU kicking
    ...

    Linus Torvalds
     

08 Feb, 2017

1 commit

  • A preparation patch for printk_safe work. No functional change.
    - rename nmi.c to printk_safe.c
    - add the `printk_safe' prefix to some of the exported functions (those
    used by both printk-safe and printk-nmi).

    Link: http://lkml.kernel.org/r/20161227141611.940-3-sergey.senozhatsky@gmail.com
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Jan Kara
    Cc: Tejun Heo
    Cc: Calvin Owens
    Cc: Steven Rostedt
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Andy Lutomirski
    Cc: Peter Hurley
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Sergey Senozhatsky
    Signed-off-by: Petr Mladek

    Sergey Senozhatsky
     

06 Feb, 2017

1 commit


04 Feb, 2017

1 commit

    This adds the kbuild infrastructure that will allow architectures to emit
    vmlinux symbol CRCs as 32-bit offsets to another location in the kernel
    where the actual value is stored. This works around problems with CRCs
    being mistaken for relocatable symbols on kernels that self relocate at
    runtime (i.e., powerpc with CONFIG_RELOCATABLE=y)

    For the kbuild side of things, this comes down to the following:

    - introducing a Kconfig symbol MODULE_REL_CRCS

    - adding a -R switch to genksyms to instruct it to emit the CRC symbols
    as references into the .rodata section

    - making modpost distinguish such references from absolute CRC symbols
    by the section index (SHN_ABS)

    - making kallsyms disregard non-absolute symbols with a __crc_ prefix
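
    The relative-CRC scheme can be sketched in user-space C. This is an
    illustration under stated assumptions, not the kernel or modpost code:
    struct crc_image and both helpers are hypothetical names. The symbol slot
    holds a 32-bit offset measured from its own address, and the real CRC is
    read by following that offset, so the pair stays valid wherever the image
    is placed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical layout: `rel` plays the role of a CRC symbol that holds a
 * 32-bit offset, and `crc` plays the role of the value stored elsewhere.
 */
struct crc_image {
    int32_t  rel;
    uint32_t padding[3];
    uint32_t crc;
};

/* "Build time": store the CRC and make `rel` refer to it relative to itself. */
static void crc_image_init(struct crc_image *img, uint32_t crc)
{
    img->crc = crc;
    img->rel = (int32_t)(offsetof(struct crc_image, crc) -
                         offsetof(struct crc_image, rel));
}

/* "Load time": follow the place-relative offset to the real CRC. */
static uint32_t follow_rel_crc(const int32_t *rel)
{
    return *(const uint32_t *)((const char *)rel + *rel);
}
```

    Copying the whole structure to a different address leaves the lookup
    working, which is the property a self-relocating kernel needs.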

    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Linus Torvalds

    Ard Biesheuvel
     

31 Jan, 2017

1 commit


24 Jan, 2017

1 commit


19 Jan, 2017

1 commit

  • PC/104 form factor devices serve a specific niche of embedded system
    users; most Linux users will not have PC/104 form factor devices. This
    patch introduces the PC104 Kconfig option, which should be used to
    filter PC/104 specific device drivers and options, so that only those
    users interested in PC/104 related options are exposed to them.

    Signed-off-by: William Breathitt Gray
    Signed-off-by: Greg Kroah-Hartman

    William Breathitt Gray
     

17 Jan, 2017

1 commit

  • RCU_EXPEDITE_BOOT should speed up the boot process by enforcing
    synchronize_rcu_expedited() instead of synchronize_rcu() during the boot
    process. There should be no reason why one does not want this and there
    is no need to worry about real time latency at this point.
    Therefore make it default.

    Note that users wishing to avoid expediting entirely, for example when
    bringing up new hardware possibly having flaky IPIs, can use the
    rcu_normal boot parameter to override boot-time expediting.

    Signed-off-by: Sebastian Andrzej Siewior
    [ paulmck: Reworded commit log. ]
    Signed-off-by: Paul E. McKenney
    Reviewed-by: Josh Triplett

    Sebastian Andrzej Siewior
     

11 Jan, 2017

2 commits

  • We now 'select SOCK_CGROUP_DATA' but Kconfig complains that this is
    not right when CONFIG_NET is disabled and there is no socket interface:

    warning: (CGROUP_BPF) selects SOCK_CGROUP_DATA which has unmet direct dependencies (NET)

    I don't know what the correct solution for this is, but simply removing
    the dependency on NET from SOCK_CGROUP_DATA by moving it out of the
    'if NET' section avoids the warning and does not produce other build
    errors.

    Fixes: 483c4933ea09 ("cgroup: Fix CGROUP_BPF config")
    Signed-off-by: Arnd Bergmann
    Signed-off-by: David S. Miller

    Arnd Bergmann
     
  • Added rdma cgroup controller that does accounting, limit enforcement
    on rdma/IB resources.

    Added rdma cgroup header file which defines its APIs to perform
    charging/uncharging functionality. It also defines APIs for the RDMA/IB
    stack for device registration. Devices which are registered will
    participate in controller functions of accounting and limit
    enforcement. It defines the rdmacg_device structure to bind the IB stack
    and the RDMA cgroup controller.

    RDMA resources are tracked using resource pool. Resource pool is per
    device, per cgroup entity which allows setting up accounting limits
    on per device basis.

    Currently resources are defined by the RDMA cgroup.

    A resource pool is created or destroyed dynamically whenever charging
    or uncharging occurs, respectively, and whenever user configuration is
    done. It's a tradeoff of memory against a little more code space:
    resource pool objects are created whenever necessary, instead of
    during cgroup creation and device registration.
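
    The create-on-charge, free-on-last-uncharge lifetime can be sketched as
    below. This is a simplified illustration with hypothetical names (rpool,
    rpool_slot, rpool_charge(), rpool_uncharge()); it models a single resource
    type and leaves out the case where user configuration keeps a pool alive.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-(cgroup, device) pool; only one resource type is modeled. */
struct rpool {
    int usage;
    int max;
};

/* Hypothetical slot that would live in a cgroup's per-device list. */
struct rpool_slot {
    struct rpool *pool; /* NULL while nothing is charged */
};

/* Charge one unit; the pool is created on first use. Returns 0 or -1. */
static int rpool_charge(struct rpool_slot *s, int max)
{
    if (!s->pool) {
        s->pool = calloc(1, sizeof(*s->pool));
        if (!s->pool)
            return -1;
        s->pool->max = max;
    }
    if (s->pool->usage >= s->pool->max) {
        if (s->pool->usage == 0) { /* do not keep an unused pool around */
            free(s->pool);
            s->pool = NULL;
        }
        return -1;
    }
    s->pool->usage++;
    return 0;
}

/* Uncharge one unit; the pool is destroyed when usage drops to zero. */
static void rpool_uncharge(struct rpool_slot *s)
{
    assert(s->pool && s->pool->usage > 0);
    if (--s->pool->usage == 0) {
        free(s->pool);
        s->pool = NULL;
    }
}
```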

    Signed-off-by: Parav Pandit
    Signed-off-by: Tejun Heo

    Parav Pandit
     

18 Dec, 2016

2 commits

  • Pull networking fixes and cleanups from David Miller:

    1) Revert bogus nla_ok() change, from Alexey Dobriyan.

    2) Various bpf validator fixes from Daniel Borkmann.

    3) Add some necessary SET_NETDEV_DEV() calls to hisi_femac and hip04
    drivers, from Dongpo Li.

    4) Several ethtool ksettings conversions from Philippe Reynes.

    5) Fix bugs in inet port management wrt. soreuseport, from Tom Herbert.

    6) XDP support for virtio_net, from John Fastabend.

    7) Fix NAT handling within a vrf, from David Ahern.

    8) Endianness fixes in dpaa_eth driver, from Claudiu Manoil

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (63 commits)
    net: mv643xx_eth: fix build failure
    isdn: Constify some function parameters
    mlxsw: spectrum: Mark split ports as such
    cgroup: Fix CGROUP_BPF config
    qed: fix old-style function definition
    net: ipv6: check route protocol when deleting routes
    r6040: move spinlock in r6040_close as SOFTIRQ-unsafe lock order detected
    irda: w83977af_ir: cleanup an indent issue
    net: sfc: use new api ethtool_{get|set}_link_ksettings
    net: davicom: dm9000: use new api ethtool_{get|set}_link_ksettings
    net: cirrus: ep93xx: use new api ethtool_{get|set}_link_ksettings
    net: chelsio: cxgb3: use new api ethtool_{get|set}_link_ksettings
    net: chelsio: cxgb2: use new api ethtool_{get|set}_link_ksettings
    bpf: fix mark_reg_unknown_value for spilled regs on map value marking
    bpf: fix overflow in prog accounting
    bpf: dynamically allocate digest scratch buffer
    gtp: Fix initialization of Flags octet in GTPv1 header
    gtp: gtp_check_src_ms_ipv4() always return success
    net/x25: use designated initializers
    isdn: use designated initializers
    ...

    Linus Torvalds
     
  • CGROUP_BPF depended on SOCK_CGROUP_DATA which can't be manually
    enabled, making it rather challenging to turn CGROUP_BPF on.

    Signed-off-by: Andy Lutomirski
    Acked-by: Alexei Starovoitov
    Signed-off-by: David S. Miller

    Andy Lutomirski