Eric Lee / smarc-fsl-linux-kernel

31 Jan, 2018

1 commit

6fde36d5c bpf: introduce BPF_JIT_ALWAYS_ON config

[ upstream commit 290af86629b25ffd1ed6232c4e9107da031705cb ]

The BPF interpreter has been used as part of the Spectre variant 2 attack (CVE-2017-5715).

A quote from the Google Project Zero blog:
"At this point, it would normally be necessary to locate gadgets in
the host kernel code that can be used to actually leak data by reading
from an attacker-controlled location, shifting and masking the result
appropriately and then using the result of that as offset to an
attacker-controlled address for a load. But piecing gadgets together
and figuring out which ones work in a speculation context seems annoying.
So instead, we decided to use the eBPF interpreter, which is built into
the host kernel - while there is no legitimate way to invoke it from inside
a VM, the presence of the code in the host kernel's text section is sufficient
to make it usable for the attack, just like with ordinary ROP gadgets."

To make the attacker's job harder, introduce the BPF_JIT_ALWAYS_ON config
option, which removes the interpreter from the kernel in favor of JIT-only mode.
So far eBPF JIT is supported by:
x64, arm64, arm32, sparc64, s390, powerpc64, mips64

The start of the JITed program is randomized and the code page is marked as read-only.
In addition, "constant blinding" can be turned on with net.core.bpf_jit_harden.

v2->v3:
- move __bpf_prog_ret0 under ifdef (Daniel)

v1->v2:
- fix init order, test_bpf and cBPF (Daniel's feedback)
- fix offloaded bpf (Jakub's feedback)
- add 'return 0' dummy in case something can invoke prog->bpf_func
- retarget bpf tree. For bpf-next the patch would need one extra hunk.
It will be sent when the trees are merged back to net-next

Considered doing:
int bpf_jit_enable __read_mostly = BPF_EBPF_JIT_DEFAULT;
but it seems better to land the patch as-is and in bpf-next remove
bpf_jit_enable global variable from all JITs, consolidate in one place
and remove this jit_init() function.
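
The JIT-only behaviour and the 'return 0' dummy from the changelog can be modelled in userspace roughly as follows. This is a minimal sketch, not the kernel code: struct prog, prog_fn, prog_ret0() and select_runtime_jit_only() are hypothetical names, and rejecting a program with -ENOTSUP when no JIT image is available is an assumption of the sketch.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified program entry-point type. */
typedef unsigned int (*prog_fn)(const void *ctx);

struct prog {
    prog_fn run;    /* stands in for prog->bpf_func */
    bool jited;
};

/* The 'return 0' dummy: harmless if something invokes it by accident. */
static inline unsigned int prog_ret0(const void *ctx)
{
    (void)ctx;
    return 0;
}

/*
 * JIT-only selection. With no interpreter to fall back on, a missing JIT
 * image leaves the dummy installed and reports an error to the caller.
 * Returning -ENOTSUP here is an assumption of this sketch.
 */
static inline int select_runtime_jit_only(struct prog *p, prog_fn jit_image)
{
    p->run = prog_ret0;
    p->jited = false;
    if (jit_image == NULL)
        return -ENOTSUP;
    p->run = jit_image;
    p->jited = true;
    return 0;
}
```

The point of installing the dummy first is that the entry pointer is never left dangling, even on the failure path.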

Signed-off-by: Alexei Starovoitov
Signed-off-by: Daniel Borkmann
Signed-off-by: Greg Kroah-Hartman

Alexei Starovoitov
2018-01-31 21:03:49 +0800

07 Oct, 2017

1 commit

2cc3ce24a kbuild: Fix optimization level choice default

The choice containing the CC_OPTIMIZE_FOR_PERFORMANCE symbol
accidentally added a "CONFIG_" prefix when trying to make it the
default, selecting an undefined symbol as the default.

The mistake is harmless here: Since the default symbol is not visible,
the choice falls back on using the visible symbol as the default
instead, which is CC_OPTIMIZE_FOR_PERFORMANCE, as intended.
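
The fallback described above can be illustrated with a small model. This is a sketch under stated assumptions, not Kconfig's implementation: struct choice_member and choice_default() are hypothetical, and "first visible member" is a simplification of how the fallback symbol is picked.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for one member of a Kconfig choice. */
struct choice_member {
    const char *name;
    bool visible;   /* false for undefined or hidden symbols */
};

/*
 * Return the member a choice would select by default: the requested
 * default if it is visible, otherwise the first visible member.
 * Returns NULL if no member is visible.
 */
static inline const struct choice_member *
choice_default(const struct choice_member *members, size_t n,
               const struct choice_member *requested)
{
    if (requested && requested->visible)
        return requested;
    for (size_t i = 0; i < n; i++)
        if (members[i].visible)
            return &members[i];
    return NULL;
}
```

In this model, a requested default named with a stray "CONFIG_" prefix is never visible, so the result is the first visible member, which matches the harmless outcome the message describes.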

A patch that makes Kconfig print a warning in this case has been
submitted separately:
http://www.spinics.net/lists/linux-kbuild/msg15566.html

Signed-off-by: Ulf Magnusson
Acked-by: Arnd Bergmann
Signed-off-by: Masahiro Yamada

Ulf Magnusson
2017-10-07 19:08:05 +0800

07 Sep, 2017

1 commit

2482ddec6 mm: add SLUB free list pointer obfuscation

This SLUB free list pointer obfuscation code is modified from Brad
Spengler/PaX Team's code in the last public patch of grsecurity/PaX
based on my understanding of the code. Changes or omissions from the
original code are mine and don't reflect the original grsecurity/PaX
code.

This adds a per-cache random value to SLUB caches that is XORed with
their freelist pointer address and value. This adds nearly zero
overhead and frustrates the very common heap overflow exploitation
method of overwriting freelist pointers.
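
As a rough userspace illustration of the XOR scheme (a sketch only; freelist_obfuscate() and freelist_reveal() are hypothetical names, and pointers are modelled as plain integers):

```c
#include <stdint.h>

/*
 * Userspace model of the obfuscation described above. The stored value is
 * the real next-free pointer XORed with a per-cache secret and with the
 * address of the slot that holds it.
 */
static inline uintptr_t freelist_obfuscate(uintptr_t cache_random,
                                           uintptr_t next_free,
                                           uintptr_t slot_addr)
{
    return next_free ^ cache_random ^ slot_addr;
}

/* XOR is its own inverse, so decoding applies the same transformation. */
static inline uintptr_t freelist_reveal(uintptr_t cache_random,
                                        uintptr_t stored,
                                        uintptr_t slot_addr)
{
    return stored ^ cache_random ^ slot_addr;
}
```

Because the slot address is mixed in, a stored value copied to a different slot decodes to a different pointer, and an attacker who overwrites a slot without knowing the secret cannot choose what it decodes to.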

A recent example of the attack is written up here:

http://cyseclabs.com/blog/cve-2016-6187-heap-off-by-one-exploit

and there is a section dedicated to the technique in the book "A Guide to
Kernel Exploitation: Attacking the Core".

This is based on patches by Daniel Micay, and refactored to minimize the
use of #ifdef.

With 200-count cycles of "hackbench -g 20 -l 1000" I saw the following
run times:

before:
mean 10.11882499999999999995
variance .03320378329145728642
stdev .18221905304181911048

after:
mean 10.12654000000000000014
variance .04700556623115577889
stdev .21680767106160192064

The difference gets lost in the noise, but if the above is to be taken
literally, using CONFIG_FREELIST_HARDENED is 0.07% slower.
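
The percentage follows directly from the two means. A minimal helper (the function name is hypothetical) makes the arithmetic explicit:

```c
/* Relative change of `after` versus `before`, expressed in percent. */
static inline double relative_slowdown_percent(double before, double after)
{
    return (after - before) / before * 100.0;
}
```

With before = 10.118825 and after = 10.12654 this evaluates to roughly 0.076, which the message truncates to 0.07%. The gap between the means (about 0.008 s) is far smaller than either standard deviation (about 0.18 s and 0.22 s), which is why the difference is described as lost in the noise.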

Link: http://lkml.kernel.org/r/20170802180609.GA66807@beast
Signed-off-by: Kees Cook
Suggested-by: Daniel Micay
Cc: Rik van Riel
Cc: Tycho Andersen
Cc: Alexander Popov
Cc: Christoph Lameter
Cc: Pekka Enberg
Cc: David Rientjes
Cc: Joonsoo Kim
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Kees Cook
2017-09-07 08:27:24 +0800

01 Aug, 2017

1 commit

bc2eecd7e futex: Allow for compiling out PI support

This makes it possible to preserve basic futex support and compile out the
PI support when RT mutexes are not available.

Signed-off-by: Nicolas Pitre
Signed-off-by: Thomas Gleixner
Cc: Peter Zijlstra
Cc: Darren Hart
Link: http://lkml.kernel.org/r/alpine.LFD.2.20.1708010024190.5981@knanqh.ubzr

Nicolas Pitre
2017-08-01 20:36:35 +0800

07 Jul, 2017

2 commits

7660a6fdd mm: allow slab_nomerge to be set at build time

Some hardened environments want to build kernels with slab_nomerge
already set (so that they do not depend on remembering to set the kernel
command line option). This is desired to reduce the risk of kernel heap
overflows being able to overwrite objects from merged caches and changes
the requirements for cache layout control, increasing the difficulty of
these attacks. By keeping caches unmerged, these kinds of exploits can
usually only damage objects in the same cache (though the risk to
metadata exploitation is unchanged).

Link: http://lkml.kernel.org/r/20170620230911.GA25238@beast
Signed-off-by: Kees Cook
Cc: Daniel Micay
Cc: David Windsor
Cc: Eric Biggers
Cc: Christoph Lameter
Cc: Jonathan Corbet
Cc: Pekka Enberg
Cc: David Rientjes
Cc: Joonsoo Kim
Cc: "Rafael J. Wysocki"
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: Mauro Carvalho Chehab
Cc: "Paul E. McKenney"
Cc: Arnd Bergmann
Cc: Andy Lutomirski
Cc: Nicolas Pitre
Cc: Tejun Heo
Cc: Daniel Mack
Cc: Sebastian Andrzej Siewior
Cc: Sergey Senozhatsky
Cc: Helge Deller
Cc: Rik van Riel
Cc: Randy Dunlap
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Kees Cook
2017-07-07 07:24:31 +0800
9ced560b8 Merge branch 'for-4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup

Pull cgroup changes from Tejun Heo:

- Waiman made the debug controller work and a lot more useful on
cgroup2

- There were a couple issues with cgroup subtree delegation. The
documentation on delegating to a non-root user was missing some part
and cgroup namespace support wasn't factoring in delegation at all.
The documentation is updated and now there is a mount option to
make cgroup namespace fit for delegation

* 'for-4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroup: implement "nsdelegate" mount option
cgroup: restructure cgroup_procs_write_permission()
cgroup: "cgroup.subtree_control" should be writeable by delegatee
cgroup: fix lockdep warning in debug controller
cgroup: refactor cgroup_masks_read() in the debug controller
cgroup: make debug an implicit controller on cgroup2
cgroup: Make debug cgroup support v2 and thread mode
cgroup: Make Kconfig prompt of debug cgroup more accurate
cgroup: Move debug cgroup to its own file
cgroup: Keep accurate count of tasks in each css_set

Linus Torvalds
2017-07-07 00:52:09 +0800

04 Jul, 2017

1 commit

9bd42183b Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler updates from Ingo Molnar:
"The main changes in this cycle were:

- Add the SYSTEM_SCHEDULING bootup state to move various scheduler
debug checks earlier into the bootup. This turns silent and
sporadically deadly bugs into nice, deterministic splats. Fix some
of the splats that triggered. (Thomas Gleixner)

- A round of restructuring and refactoring of the load-balancing and
topology code (Peter Zijlstra)

- Another round of consolidating ~20 years of incremental scheduler code
history: this time in terms of wait-queue nomenclature. (I didn't
get much feedback on these renaming patches, and we can still
easily change any names I might have misplaced, so if anyone hates
a new name, please holler and I'll fix it.) (Ingo Molnar)

- sched/numa improvements, fixes and updates (Rik van Riel)

- Another round of x86/tsc scheduler clock code improvements, in hope
of making it more robust (Peter Zijlstra)

- Improve NOHZ behavior (Frederic Weisbecker)

- Deadline scheduler improvements and fixes (Luca Abeni, Daniel
Bristot de Oliveira)

- Simplify and optimize the topology setup code (Lauro Ramos
Venancio)

- Debloat and decouple scheduler code some more (Nicolas Pitre)

- Simplify code by making better use of llist primitives (Byungchul
Park)

- ... plus other fixes and improvements"

* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (103 commits)
sched/cputime: Refactor the cputime_adjust() code
sched/debug: Expose the number of RT/DL tasks that can migrate
sched/numa: Hide numa_wake_affine() from UP build
sched/fair: Remove effective_load()
sched/numa: Implement NUMA node level wake_affine()
sched/fair: Simplify wake_affine() for the single socket case
sched/numa: Override part of migrate_degrades_locality() when idle balancing
sched/rt: Move RT related code from sched/core.c to sched/rt.c
sched/deadline: Move DL related code from sched/core.c to sched/deadline.c
sched/cpuset: Only offer CONFIG_CPUSETS if SMP is enabled
sched/fair: Spare idle load balancing on nohz_full CPUs
nohz: Move idle balancer registration to the idle path
sched/loadavg: Generalize "_idle" naming to "_nohz"
sched/core: Drop the unused try_get_task_struct() helper function
sched/fair: WARN() and refuse to set buddy when !se->on_rq
sched/debug: Fix SCHED_WARN_ON() to return a value on !CONFIG_SCHED_DEBUG as well
sched/wait: Disambiguate wq_entry->task_list and wq_head->task_list naming
sched/wait: Move bit_wait_table[] and related functionality from sched/core.c to sched/wait_bit.c
sched/wait: Split out the wait_bit*() APIs from <linux/wait.h> into <linux/wait_bit.h>
sched/wait: Re-adjust macro line continuation backslashes in <linux/wait.h>
...

Linus Torvalds
2017-07-04 04:08:04 +0800

23 Jun, 2017

1 commit

e1d4eeec5 sched/cpuset: Only offer CONFIG_CPUSETS if SMP is enabled

Make CONFIG_CPUSETS=y depend on SMP as this feature makes no sense
on UP. This allows for configuring out cpuset_cpumask_can_shrink()
and task_can_attach() entirely, which shrinks the kernel a bit.

Signed-off-by: Nicolas Pitre
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Link: http://lkml.kernel.org/r/20170614171926.8345-2-nicolas.pitre@linaro.org
Signed-off-by: Ingo Molnar

Nicolas Pitre
2017-06-23 16:46:44 +0800

15 Jun, 2017

1 commit

23b0be480 cgroup: Make Kconfig prompt of debug cgroup more accurate

Make the Kconfig prompt and description of the debug cgroup controller
more accurate by saying that it is for debugging purposes only and that its
interfaces are unstable.

Signed-off-by: Waiman Long
Signed-off-by: Tejun Heo

Waiman Long
2017-06-15 04:01:21 +0800

09 Jun, 2017

6 commits

0af92d460 rcu: Move RCU non-debug Kconfig options to kernel/rcu

RCU's Kconfig options are scattered, and there are enough of them
that it would be good for them to be more centralized. This commit
therefore extracts RCU's Kconfig options from init/Kconfig into a new
kernel/rcu/Kconfig file.

Reported-by: Ingo Molnar
Signed-off-by: Paul E. McKenney

Paul E. McKenney
2017-06-09 09:52:44 +0800
44c65ff2e rcu: Eliminate NOCBs CPU-state Kconfig options

The CONFIG_RCU_NOCB_CPU_ALL, CONFIG_RCU_NOCB_CPU_NONE, and
CONFIG_RCU_NOCB_CPU_ZERO Kconfig options are used only in testing and
are redundant with the rcu_nocbs= boot parameter. This commit therefore
removes these three Kconfig options and adjusts the rcutorture scripts
to use the boot parameter instead.

Signed-off-by: Paul E. McKenney

Paul E. McKenney
2017-06-09 09:52:43 +0800
ae91aa0ad rcu: Remove debugfs tracing

RCU's debugfs tracing used to be the only reasonable low-level debug
information available, but ftrace and event tracing have since surpassed
the RCU debugfs level of usefulness. This commit therefore removes
RCU's debugfs tracing.

Signed-off-by: Paul E. McKenney

Paul E. McKenney
2017-06-09 09:52:43 +0800
bd8cc5a06 srcu: Remove Classic SRCU

Classic SRCU was only ever intended to be a fallback in case of issues
with Tree/Tiny SRCU, and the latter two are doing quite well in testing.
This commit therefore removes Classic SRCU.

Signed-off-by: Paul E. McKenney

Paul E. McKenney
2017-06-09 09:52:42 +0800
f7a10a975 rcu: Remove the RCU_KTHREAD_PRIO Kconfig option

Anything that can be done with the RCU_KTHREAD_PRIO Kconfig option can
also be done with the rcutree.kthread_prio kernel boot parameter.
This commit therefore removes this Kconfig option.

Reported-by: Linus Torvalds
Signed-off-by: Paul E. McKenney
Cc: Frederic Weisbecker
Cc: Rik van Riel

Paul E. McKenney
2017-06-09 09:52:39 +0800
2464dd940 srcu: Apply trivial callback lists to shrink Tiny SRCU

The rcu_segcblist structure provides quite a bit of functionality, and
Tiny SRCU needs almost none of it. So this commit replaces Tiny SRCU's
uses of rcu_segcblist with a simple singly linked list with tail pointer.
This change significantly reduces Tiny SRCU's memory footprint, more
than making up for the growth caused by the creation of rcu_segcblist.c
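
A singly linked list with a tail pointer, as mentioned above, can be sketched like this. It is an illustration only, not the Tiny SRCU code: struct cb_node, struct cb_list and the helper names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical callback node; the kernel's real type is struct rcu_head. */
struct cb_node {
    struct cb_node *next;
};

/* Singly linked list with a tail pointer, giving O(1) append. */
struct cb_list {
    struct cb_node *head;
    struct cb_node **tail;  /* points at the last node's next, or at head */
};

static inline void cb_list_init(struct cb_list *l)
{
    l->head = NULL;
    l->tail = &l->head;
}

/* Append one node at the end without walking the list. */
static inline void cb_list_enqueue(struct cb_list *l, struct cb_node *n)
{
    n->next = NULL;
    *l->tail = n;
    l->tail = &n->next;
}

/* Detach and return the whole chain, leaving the list empty. */
static inline struct cb_node *cb_list_take_all(struct cb_list *l)
{
    struct cb_node *first = l->head;

    cb_list_init(l);
    return first;
}
```

Two pointers per list are enough for enqueue-at-tail and drain-everything, which is the kind of minimal functionality the message says Tiny SRCU needs.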

Signed-off-by: Paul E. McKenney

Paul E. McKenney
2017-06-09 09:52:35 +0800

08 Jun, 2017

1 commit

07f6e64bf srcu: Make SRCU be once again optional

Commit d160a727c40e ("srcu: Make SRCU be built by default") came in response
to build errors, which were caused by code that included srcu.h
despite !SRCU. However, srcutiny.o is almost 2K of code, which is not
insignificant for those attempting to run the Linux kernel on IoT devices.
This commit therefore makes SRCU be once again optional, and adjusts
srcu.h to allow error-free inclusion in !SRCU kernel builds.

Signed-off-by: Paul E. McKenney
Acked-by: Nicolas Pitre

Paul E. McKenney
2017-06-08 23:25:38 +0800

02 May, 2017

1 commit

98059b986 rcu: Separately compile large rcu_segcblist functions

This commit creates a new kernel/rcu/rcu_segcblist.c file that
contains non-trivial segcblist functions. Trivial functions
remain as static inline functions in kernel/rcu/rcu_segcblist.h

Reported-by: Linus Torvalds
Signed-off-by: Paul E. McKenney
Cc: Peter Zijlstra
Cc: Thomas Gleixner

Paul E. McKenney
2017-05-02 22:21:02 +0800

24 Apr, 2017

2 commits

d160a727c srcu: Make SRCU be built by default

SRCU is optional, and included only if there is a "select SRCU" in effect.
However, we now have Tiny SRCU, so this commit defaults CONFIG_SRCU=y.

Reported-by: kbuild test robot
Signed-off-by: Paul E. McKenney

Paul E. McKenney
2017-04-24 23:36:02 +0800
677df9d46 srcu: Fix Kconfig botch when SRCU not selected

If the CONFIG_SRCU option is not selected, for example, when building
arch/tile allnoconfig, the following build errors appear:

kernel/rcu/tree.o: In function `srcu_online_cpu':
tree.c:(.text+0x4248): multiple definition of `srcu_online_cpu'
kernel/rcu/srcutree.o:srcutree.c:(.text+0x2120): first defined here
kernel/rcu/tree.o: In function `srcu_offline_cpu':
tree.c:(.text+0x4250): multiple definition of `srcu_offline_cpu'
kernel/rcu/srcutree.o:srcutree.c:(.text+0x2160): first defined here

The corresponding .config file shows CONFIG_TREE_SRCU=y, but no sign
of CONFIG_SRCU, which fatally confuses SRCU's #ifdefs, resulting in
the above errors. The reason this occurs is the following line in
init/Kconfig's definition for TREE_SRCU:

default y if !TINY_RCU && !CLASSIC_SRCU

If CONFIG_CLASSIC_SRCU=n, as it will be for allnoconfig, and if
CONFIG_SMP=y, then we will get CONFIG_TREE_SRCU=y but no CONFIG_SRCU,
as seen in the .config file, and which will result in the above errors.
This error did not show up during rcutorture testing because rcutorture
forces CONFIG_SRCU=y, as it must to prevent build errors in rcutorture.c.

This commit therefore conditions TREE_SRCU (and TINY_SRCU, while it is
at it) with SRCU, like this:

default y if SRCU && !TINY_RCU && !CLASSIC_SRCU
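
The effect of the added SRCU term can be checked with a small truth-function model of the two "default y if" conditions. This is a sketch; the function names are hypothetical and the booleans stand in for the Kconfig symbols.

```c
#include <stdbool.h>

/* Old condition: default y if !TINY_RCU && !CLASSIC_SRCU */
static inline bool tree_srcu_default_old(bool srcu, bool tiny_rcu,
                                         bool classic_srcu)
{
    (void)srcu;  /* SRCU was not consulted at all */
    return !tiny_rcu && !classic_srcu;
}

/* New condition: default y if SRCU && !TINY_RCU && !CLASSIC_SRCU */
static inline bool tree_srcu_default_new(bool srcu, bool tiny_rcu,
                                         bool classic_srcu)
{
    return srcu && !tiny_rcu && !classic_srcu;
}
```

For the failing configuration (SRCU=n, TINY_RCU=n, CLASSIC_SRCU=n) the old condition yields y and the new one yields n, so TREE_SRCU can no longer be enabled without SRCU.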

Reported-by: kbuild test robot
Reported-by: Ingo Molnar
Signed-off-by: Paul E. McKenney
Link: http://lkml.kernel.org/r/20170423162205.GP3956@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar

Paul E. McKenney
2017-04-24 14:14:48 +0800

21 Apr, 2017

1 commit

f2094107a Merge branches 'doc.2017.04.12a', 'fixes.2017.04.19a' and 'srcu.2017.04.21a' into HEAD

doc.2017.04.12a: Documentation updates
fixes.2017.04.19a: Miscellaneous fixes
srcu.2017.04.21a: Parallelize SRCU callback handling

Paul E. McKenney
2017-04-21 21:00:13 +0800

20 Apr, 2017

1 commit

024828800 rcu: Make RCU_FANOUT_LEAF help text more explicit about skew_tick

If you set RCU_FANOUT_LEAF too high, you can get lock contention
on the leaf rcu_node, and you should boot with the skew_tick kernel
parameter set in order to avoid this lock contention. This commit
therefore upgrades the RCU_FANOUT_LEAF help text to explicitly state
this.

Signed-off-by: Paul E. McKenney

Paul E. McKenney
2017-04-20 00:29:17 +0800

19 Apr, 2017

2 commits

dad81a202 srcu: Introduce CLASSIC_SRCU Kconfig option

The TREE_SRCU rewrite is large and a bit on the non-simple side, so
this commit helps reduce risk by allowing the old v4.11 SRCU algorithm
to be selected using a new CLASSIC_SRCU Kconfig option that depends
on RCU_EXPERT. The default is to use the new TREE_SRCU and TINY_SRCU
algorithms, in order to help get these the testing that they need.
However, if your users do not require the update-side scalability that
is to be provided by TREE_SRCU, select RCU_EXPERT and then CLASSIC_SRCU
to revert back to the old classic SRCU algorithm.

Signed-off-by: Paul E. McKenney

Paul E. McKenney
2017-04-19 02:38:23 +0800
d8be81735 srcu: Create a tiny SRCU

In response to automated complaints about modifications to SRCU
increasing its size, this commit creates a tiny SRCU that is
used in SMP=n && PREEMPT=n builds.

Signed-off-by: Paul E. McKenney

Paul E. McKenney
2017-04-19 02:38:22 +0800

28 Feb, 2017

1 commit

f7878dc3a Merge branch 'for-4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup

Pull cgroup updates from Tejun Heo:
"Several noteworthy changes.

- Parav's rdma controller is finally merged. It is very
straightforward and can limit the absolute numbers of common rdma
constructs used by different cgroups.

- kernel/cgroup.c got too chubby and disorganized. Created
kernel/cgroup/ subdirectory and moved all cgroup related files
under kernel/ there and reorganized the core code. This hurts for
backporting patches but was long overdue.

- cgroup v2 process listing reimplemented so that it no longer
depends on allocating a buffer large enough to cache the entire
result to sort and uniq the output. v2 has always mangled the sort
order to ensure that users don't depend on the sorted output, so
this shouldn't surprise anybody. This makes the pid listing
functions use the same iterators that are used internally, which
have to have the same iterating capabilities anyway.

- perf cgroup filtering now works automatically on cgroup v2. This
patch was posted a long time ago but somehow fell through the
cracks.

- misc fixes and documentation updates"

* 'for-4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (27 commits)
kernfs: fix locking around kernfs_ops->release() callback
cgroup: drop the matching uid requirement on migration for cgroup v2
cgroup, perf_event: make perf_event controller work on cgroup2 hierarchy
cgroup: misc cleanups
cgroup: call subsys->*attach() only for subsystems which are actually affected by migration
cgroup: track migration context in cgroup_mgctx
cgroup: cosmetic update to cgroup_taskset_add()
rdmacg: Fixed uninitialized current resource usage
cgroup: Add missing cgroup-v2 PID controller documentation.
rdmacg: Added documentation for rdmacg
IB/core: added support to use rdma cgroup controller
rdmacg: Added rdma cgroup controller
cgroup: fix a comment typo
cgroup: fix RCU related sparse warnings
cgroup: move namespace code to kernel/cgroup/namespace.c
cgroup: rename functions for consistency
cgroup: move v1 mount functions to kernel/cgroup/cgroup-v1.c
cgroup: separate out cgroup1_kf_syscall_ops
cgroup: refactor mount path and clearly distinguish v1 and v2 paths
cgroup: move cgroup v1 specific code to kernel/cgroup/cgroup-v1.c
...

Linus Torvalds
2017-02-28 13:41:08 +0800

23 Feb, 2017

4 commits

bc49a7831 Merge branch 'akpm' (patches from Andrew)

Merge updates from Andrew Morton:
"142 patches:

- DAX updates

- various misc bits

- OCFS2 updates

- most of MM"

* emailed patches from Andrew Morton: (142 commits)
mm/z3fold.c: limit first_num to the actual range of possible buddy indexes
mm: fix stray kernel-doc notation
zram: remove obsolete sysfs attrs
mm/memblock.c: remove unnecessary log and clean up
oom-reaper: use madvise_dontneed() logic to decide if unmap the VMA
mm: drop unused argument of zap_page_range()
mm: drop zap_details::check_swap_entries
mm: drop zap_details::ignore_dirty
mm, page_alloc: warn_alloc nodemask is NULL when cpusets are disabled
mm: help __GFP_NOFAIL allocations which do not trigger OOM killer
mm, oom: do not enforce OOM killer for __GFP_NOFAIL automatically
mm: consolidate GFP_NOFAIL checks in the allocator slowpath
lib/show_mem.c: teach show_mem to work with the given nodemask
arch, mm: remove arch specific show_mem
mm, page_alloc: warn_alloc print nodemask
mm, page_alloc: do not report all nodes in show_mem
Revert "mm: bail out in shrink_inactive_list()"
mm, vmscan: consider eligible zones in get_scan_count
mm, vmscan: cleanup lru size claculations
mm, vmscan: do not count freed pages as PGDEACTIVATE
...

Linus Torvalds
2017-02-23 11:29:24 +0800
7d91de744 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk

Pull printk updates from Petr Mladek:

- Add Petr Mladek, Sergey Senozhatsky as printk maintainers, and Steven
Rostedt as the printk reviewer. This idea came up after the
discussion about printk issues at Kernel Summit. It was formulated
and discussed at lkml[1].

- Extend a lock-less NMI per-cpu buffers idea to handle recursive
printk() calls by Sergey Senozhatsky[2]. It is the first step in
sanitizing printk as discussed at Kernel Summit.

The change makes it possible to see messages that would normally get ignored or
would cause a deadlock.

It also makes it possible to enable lockdep in printk(). This already paid off.
The testing in linux-next helped to discover two old problems that
were hidden before[3][4].

- Remove unused parameter by Sergey Senozhatsky. Clean up after a past
change.

[1] http://lkml.kernel.org/r/1481798878-31898-1-git-send-email-pmladek@suse.com
[2] http://lkml.kernel.org/r/20161227141611.940-1-sergey.senozhatsky@gmail.com
[3] http://lkml.kernel.org/r/20170215044332.30449-1-sergey.senozhatsky@gmail.com
[4] http://lkml.kernel.org/r/20170217015932.11898-1-sergey.senozhatsky@gmail.com

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk:
printk: drop call_console_drivers() unused param
printk: convert the rest to printk-safe
printk: remove zap_locks() function
printk: use printk_safe buffers in printk
printk: report lost messages in printk safe/nmi contexts
printk: always use deferred printk when flush printk_safe lines
printk: introduce per-cpu safe_print seq buffer
printk: rename nmi.c and exported api
printk: use vprintk_func in vprintk()
MAINTAINERS: Add printk maintainers

Linus Torvalds
2017-02-23 09:33:34 +0800
1663f26df slub: make sysfs directories for memcg sub-caches optional

SLUB creates a per-cache directory under /sys/kernel/slab which hosts a
bunch of debug files. Usually, there aren't that many caches on a
system and this doesn't really matter; however, if memcg is in use, each
cache can have per-cgroup sub-caches. SLUB creates the same directories
for these sub-caches under /sys/kernel/slab/$CACHE/cgroup.

Unfortunately, because there can be a lot of cgroups, active or
draining, the product of the numbers of caches, cgroups and files in
each directory can reach a very high number - hundreds of thousands is
commonplace. Millions and beyond aren't difficult to reach either.

What's under /sys/kernel/slab is primarily for debugging and the
information and control on a root cache already cover its
sub-caches. While having a separate directory for each sub-cache can be
helpful for development, it doesn't make much sense to pay this amount
of overhead by default.

This patch introduces a boot parameter slub_memcg_sysfs which determines
whether to create sysfs directories for per-memcg sub-caches. It also
adds CONFIG_SLUB_MEMCG_SYSFS_ON which determines the boot parameter's
default value and defaults to 0.

[akpm@linux-foundation.org: kset_unregister(NULL) is legal]
Link: http://lkml.kernel.org/r/20170204145203.GB26958@mtj.duckdns.org
Signed-off-by: Tejun Heo
Cc: Christoph Lameter
Cc: Pekka Enberg
Cc: David Rientjes
Cc: Joonsoo Kim
Cc: Vladimir Davydov
Cc: Michal Hocko
Cc: Johannes Weiner
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Tejun Heo
2017-02-23 08:41:27 +0800
e30aee9e1 Merge tag 'char-misc-4.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char/misc driver updates from Greg KH:
"Here is the big char/misc driver patchset for 4.11-rc1.

Lots of different driver subsystems updated here: rework for the
hyperv subsystem to handle new platforms better, mei and w1 and extcon
driver updates, as well as a number of other "minor" driver updates.

All of these have been in linux-next for a while with no reported
issues"

* tag 'char-misc-4.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (169 commits)
goldfish: Sanitize the broken interrupt handler
x86/platform/goldfish: Prevent unconditional loading
vmbus: replace modulus operation with subtraction
vmbus: constify parameters where possible
vmbus: expose hv_begin/end_read
vmbus: remove conditional locking of vmbus_write
vmbus: add direct isr callback mode
vmbus: change to per channel tasklet
vmbus: put related per-cpu variable together
vmbus: callback is in softirq not workqueue
binder: Add support for file-descriptor arrays
binder: Add support for scatter-gather
binder: Add extra size to allocator
binder: Refactor binder_transact()
binder: Support multiple /dev instances
binder: Deal with contexts in debugfs
binder: Support multiple context managers
binder: Split flat_binder_object
auxdisplay: ht16k33: remove private workqueue
auxdisplay: ht16k33: rework input device initialization
...

Linus Torvalds
2017-02-23 03:38:22 +0800

21 Feb, 2017

1 commit

f7458a5d6 Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull RCU updates from Ingo Molnar:
"The RCU changes in this cycle are:

- Dynticks updates, consolidating open-coded counter accesses into a
well-defined API

- SRCU updates: Simplify algorithm, add formal verification

- Documentation updates

- Miscellaneous fixes

- Torture-test updates

Most of the diffstat comes from the relatively large documentation
update"

* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (42 commits)
srcu: Reduce probability of SRCU ->unlock_count[] counter overflow
rcutorture: Add CBMC-based formal verification for SRCU
srcu: Force full grace-period ordering
srcu: Implement more-efficient reader counts
rcu: Adjust FQS offline checks for exact online-CPU detection
rcu: Check cond_resched_rcu_qs() state less often to reduce GP overhead
rcu: Abstract extended quiescent state determination
rcu: Abstract dynticks extended quiescent state enter/exit operations
rcu: Add lockdep checks to synchronous expedited primitives
rcu: Eliminate unused expedited_normal counter
llist: Clarify comments about when locking is needed
rcu: Fix comment in rcu_organize_nocb_kthreads()
rcu: Enable RCU tracepoints by default to aid in debugging
rcu: Make rcu_cpu_starting() use its "cpu" argument
rcu: Add comment headers to expedited-grace-period counter functions
rcu: Don't wake rcuc/X kthreads on NOCB CPUs
rcu: Re-enable TASKS_RCU for User Mode Linux
rcu: Once again use NMI-based stack traces in stall warnings
rcu: Remove short-term CPU kicking
rcu: Add long-term CPU kicking
...

Linus Torvalds
2017-02-21 03:21:17 +0800

08 Feb, 2017

1 commit

f92bac3b1 printk: rename nmi.c and exported api

A preparation patch for printk_safe work. No functional change.
- rename nmi.c to printk_safe.c
- add the `printk_safe' prefix to some of the exported functions (those
used by both printk-safe and printk-nmi).

Link: http://lkml.kernel.org/r/20161227141611.940-3-sergey.senozhatsky@gmail.com
Cc: Andrew Morton
Cc: Linus Torvalds
Cc: Jan Kara
Cc: Tejun Heo
Cc: Calvin Owens
Cc: Steven Rostedt
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Andy Lutomirski
Cc: Peter Hurley
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Sergey Senozhatsky
Signed-off-by: Petr Mladek

Sergey Senozhatsky
2017-02-08 18:02:33 +0800

06 Feb, 2017

1 commit

17fa87fe5 Merge 4.10-rc7 into char-misc-next

We want the hv and other fixes in here as well to handle merge and
testing issues.

Signed-off-by: Greg Kroah-Hartman

Greg Kroah-Hartman
2017-02-06 16:39:13 +0800

04 Feb, 2017

1 commit

56067812d kbuild: modversions: add infrastructure for emitting relative CRCs

This adds the kbuild infrastructure that will allow architectures to emit
vmlinux symbol CRCs as 32-bit offsets to another location in the kernel
where the actual value is stored. This works around problems with CRCs
being mistaken for relocatable symbols on kernels that self relocate at
runtime (i.e., powerpc with CONFIG_RELOCATABLE=y)

For the kbuild side of things, this comes down to the following:

- introducing a Kconfig symbol MODULE_REL_CRCS

- adding a -R switch to genksyms to instruct it to emit the CRC symbols
as references into the .rodata section

- making modpost distinguish such references from absolute CRC symbols
by the section index (SHN_ABS)

- making kallsyms disregard non-absolute symbols with a __crc_ prefix
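
The idea of a CRC stored as a 32-bit offset to the real value can be sketched as follows. This is a minimal userspace illustration; the helper name is hypothetical and the layout of the offset entry relative to the value is an assumption of the sketch.

```c
#include <stdint.h>

/*
 * Resolve a relative CRC entry: the entry holds a signed 32-bit byte offset
 * from its own address to the location where the real CRC value is stored.
 */
static inline uint32_t resolve_rel_crc(const int32_t *entry)
{
    const unsigned char *target = (const unsigned char *)entry + *entry;

    return *(const uint32_t *)(const void *)target;
}
```

Since the entry is a distance between two locations in the same image, it stays valid when the whole image is moved, so it does not need to be treated as a relocatable symbol.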

Signed-off-by: Ard Biesheuvel
Signed-off-by: Linus Torvalds

Ard Biesheuvel
2017-02-04 00:28:25 +0800

31 Jan, 2017

1 commit

a8709fa4a Merge branch 'for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu

Pull RCU changes from Paul E. McKenney:

- Dynticks updates, consolidating open-coded counter accesses into a well-defined API

- SRCU updates: Simplify algorithm, add formal verification

- Documentation updates

- Miscellaneous fixes

- Torture-test updates

Signed-off-by: Ingo Molnar <mingo@kernel.org>

Ingo Molnar
2017-01-31 14:45:42 +0800

24 Jan, 2017

1 commit

1626c365f rcu: Re-enable TASKS_RCU for User Mode Linux

Now that User Mode Linux supports arch_irqs_disabled_flags(), this
commit re-enables TASKS_RCU for User Mode Linux.

Reported-by: Richard Weinberger
Signed-off-by: Paul E. McKenney
Reviewed-by: Josh Triplett

Paul E. McKenney
2017-01-24 03:37:12 +0800

19 Jan, 2017

1 commit

ad90a3de9 pc104: Introduce the PC104 Kconfig option

PC/104 form factor devices serve a specific niche of embedded system
users; most Linux users will not have PC/104 form factor devices. This
patch introduces the PC104 Kconfig option, which should be used to
filter PC/104 specific device drivers and options, so that only those
users interested in PC/104 related options are exposed to them.

Signed-off-by: William Breathitt Gray
Signed-off-by: Greg Kroah-Hartman

William Breathitt Gray
2017-01-19 19:42:25 +0800

17 Jan, 2017

1 commit

7c6094db5 rcu: update: Make RCU_EXPEDITE_BOOT be the default

RCU_EXPEDITE_BOOT should speed up the boot process by enforcing
synchronize_rcu_expedited() instead of synchronize_rcu() during the boot
process. There should be no reason why one does not want this and there
is no need to worry about real-time latency at this point.
Therefore make it the default.

Note that users wishing to avoid expediting entirely, for example when
bringing up new hardware possibly having flaky IPIs, can use the
rcu_normal boot parameter to override boot-time expediting.
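
The decision described above can be modelled with a small userspace sketch. This is a toy model under stated assumptions, not the kernel's RCU code: the struct rcu_boot_state, its fields, and choose_grace_period() are hypothetical names, and the real logic takes more inputs than shown here.

```c
#include <stdbool.h>

/* Which kind of grace period a synchronize call would use (toy model). */
enum gp_kind { GP_NORMAL, GP_EXPEDITED };

/* Hypothetical state bundle, invented for this illustration. */
struct rcu_boot_state {
    bool booting;        /* the system is still in the boot process */
    bool expedite_boot;  /* models boot-time expediting being in effect */
    bool rcu_normal;     /* models the rcu_normal boot parameter */
};

static enum gp_kind choose_grace_period(const struct rcu_boot_state *s)
{
    /* rcu_normal overrides boot-time expediting entirely. */
    if (s->rcu_normal)
        return GP_NORMAL;

    /* Expedite only while booting, when latency is not yet a concern. */
    if (s->booting && s->expedite_boot)
        return GP_EXPEDITED;

    return GP_NORMAL;
}
```

In this model, bringing up hardware with flaky IPIs corresponds to setting rcu_normal, which yields normal grace periods even during boot.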

Signed-off-by: Sebastian Andrzej Siewior
[ paulmck: Reworded commit log. ]
Signed-off-by: Paul E. McKenney
Reviewed-by: Josh Triplett

Sebastian Andrzej Siewior
2017-01-17 08:56:39 +0800

11 Jan, 2017

2 commits

73b351473 cgroup: move CONFIG_SOCK_CGROUP_DATA to init/Kconfig

We now 'select SOCK_CGROUP_DATA' but Kconfig complains that this is
not right when CONFIG_NET is disabled and there is no socket interface:

warning: (CGROUP_BPF) selects SOCK_CGROUP_DATA which has unmet direct dependencies (NET)

I don't know what the correct solution for this is, but simply removing
the dependency on NET from SOCK_CGROUP_DATA by moving it out of the
'if NET' section avoids the warning and does not produce other build
errors.

Fixes: 483c4933ea09 ("cgroup: Fix CGROUP_BPF config")
Signed-off-by: Arnd Bergmann
Signed-off-by: David S. Miller

Arnd Bergmann
2017-01-11 22:47:10 +0800
39d3e7584 rdmacg: Added rdma cgroup controller

Added an rdma cgroup controller that does accounting and limit enforcement
on rdma/IB resources.

Added an rdma cgroup header file which defines its APIs to perform
charging/uncharging functionality. It also defines APIs for the RDMA/IB
stack for device registration. Devices which are registered will
participate in the controller functions of accounting and limit
enforcement. It defines the rdmacg_device structure to bind the IB stack
and the RDMA cgroup controller.

RDMA resources are tracked using a resource pool. A resource pool is a
per-device, per-cgroup entity, which allows setting up accounting limits
on a per-device basis.

Currently resources are defined by the RDMA cgroup.

A resource pool is created/destroyed dynamically whenever
charging/uncharging occurs, respectively, and whenever user
configuration is done. It's a tradeoff of memory vs. a little more code
space: the resource pool object is created whenever necessary, instead
of during cgroup creation and device registration time.
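
The lazy create/destroy behaviour of the resource pool can be sketched in userspace C. This is a simplified model under stated assumptions, not the rdmacg implementation: it handles a single (cgroup, device) pair without hierarchy or locking, and NUM_RES, res_pool, cg_dev_pair, set_limit, try_charge and uncharge are hypothetical names chosen for illustration.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

#define NUM_RES 2  /* hypothetical number of tracked resource types */

struct res_pool {
    int  usage[NUM_RES];
    int  limit[NUM_RES];
    bool user_configured;  /* keep the pool alive once limits were set */
};

/* One (cgroup, device) pair; the pool stays NULL until first needed. */
struct cg_dev_pair {
    struct res_pool *pool;
};

/* Return the pair's pool, allocating it with "no limit" on first use. */
static struct res_pool *get_pool(struct cg_dev_pair *p)
{
    if (!p->pool) {
        p->pool = calloc(1, sizeof(*p->pool));
        if (!p->pool)
            return NULL;
        for (int i = 0; i < NUM_RES; i++)
            p->pool->limit[i] = INT_MAX;
    }
    return p->pool;
}

/* Free the pool when nothing is charged and no limits were configured. */
static void put_pool_if_idle(struct cg_dev_pair *p)
{
    if (!p->pool || p->pool->user_configured)
        return;
    for (int i = 0; i < NUM_RES; i++)
        if (p->pool->usage[i] != 0)
            return;
    free(p->pool);
    p->pool = NULL;
}

/* User configuration also creates the pool on demand. */
static int set_limit(struct cg_dev_pair *p, int res, int limit)
{
    struct res_pool *pool;

    if (res < 0 || res >= NUM_RES || limit < 0)
        return -1;
    pool = get_pool(p);
    if (!pool)
        return -1;
    pool->limit[res] = limit;
    pool->user_configured = true;
    return 0;
}

/* Charge one unit; fails with -1 when the limit would be exceeded. */
static int try_charge(struct cg_dev_pair *p, int res)
{
    struct res_pool *pool;

    if (res < 0 || res >= NUM_RES)
        return -1;
    pool = get_pool(p);
    if (!pool)
        return -1;
    if (pool->usage[res] >= pool->limit[res]) {
        put_pool_if_idle(p);
        return -1;
    }
    pool->usage[res]++;
    return 0;
}

/* Uncharge one unit and drop the pool if it became idle. */
static void uncharge(struct cg_dev_pair *p, int res)
{
    if (res < 0 || res >= NUM_RES || !p->pool || p->pool->usage[res] == 0)
        return;
    p->pool->usage[res]--;
    put_pool_if_idle(p);
}
```

The sketch shows the tradeoff named above: pairs that never charge anything cost no pool memory, at the price of the extra allocation and cleanup paths in the charge and uncharge functions.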

Signed-off-by: Parav Pandit
Signed-off-by: Tejun Heo

Parav Pandit
2017-01-11 00:14:27 +0800

18 Dec, 2016

2 commits

52f40e9d6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking fixes and cleanups from David Miller:

1) Revert bogus nla_ok() change, from Alexey Dobriyan.

2) Various bpf validator fixes from Daniel Borkmann.

3) Add some necessary SET_NETDEV_DEV() calls to hisi_femac and hip04
drivers, from Dongpo Li.

4) Several ethtool ksettings conversions from Philippe Reynes.

5) Fix bugs in inet port management wrt. soreuseport, from Tom Herbert.

6) XDP support for virtio_net, from John Fastabend.

7) Fix NAT handling within a vrf, from David Ahern.

8) Endianness fixes in dpaa_eth driver, from Claudiu Manoil.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (63 commits)
net: mv643xx_eth: fix build failure
isdn: Constify some function parameters
mlxsw: spectrum: Mark split ports as such
cgroup: Fix CGROUP_BPF config
qed: fix old-style function definition
net: ipv6: check route protocol when deleting routes
r6040: move spinlock in r6040_close as SOFTIRQ-unsafe lock order detected
irda: w83977af_ir: cleanup an indent issue
net: sfc: use new api ethtool_{get|set}_link_ksettings
net: davicom: dm9000: use new api ethtool_{get|set}_link_ksettings
net: cirrus: ep93xx: use new api ethtool_{get|set}_link_ksettings
net: chelsio: cxgb3: use new api ethtool_{get|set}_link_ksettings
net: chelsio: cxgb2: use new api ethtool_{get|set}_link_ksettings
bpf: fix mark_reg_unknown_value for spilled regs on map value marking
bpf: fix overflow in prog accounting
bpf: dynamically allocate digest scratch buffer
gtp: Fix initialization of Flags octet in GTPv1 header
gtp: gtp_check_src_ms_ipv4() always return success
net/x25: use designated initializers
isdn: use designated initializers
...

Linus Torvalds
2016-12-18 12:17:04 +0800
483c4933e cgroup: Fix CGROUP_BPF config

CGROUP_BPF depended on SOCK_CGROUP_DATA which can't be manually
enabled, making it rather challenging to turn CGROUP_BPF on.

Signed-off-by: Andy Lutomirski
Acked-by: Alexei Starovoitov
Signed-off-by: David S. Miller

Andy Lutomirski
2016-12-18 10:42:45 +0800