02 Mar, 2016
1 commit
-
Make it possible to write a target state to the per cpu state file, so we can
switch between states.Signed-off-by: Thomas Gleixner
Cc: linux-arch@vger.kernel.org
Cc: Rik van Riel
Cc: Rafael Wysocki
Cc: "Srivatsa S. Bhat"
Cc: Peter Zijlstra
Cc: Arjan van de Ven
Cc: Sebastian Siewior
Cc: Rusty Russell
Cc: Steven Rostedt
Cc: Oleg Nesterov
Cc: Tejun Heo
Cc: Andrew Morton
Cc: Paul McKenney
Cc: Linus Torvalds
Cc: Paul Turner
Link: http://lkml.kernel.org/r/20160226182341.022814799@linutronix.de
Signed-off-by: Thomas Gleixner
10 Feb, 2016
1 commit
-
Workqueue used to guarantee local execution for work items queued
without explicit target CPU. The guarantee is gone now which can
break some usages in subtle ways. To flush out those cases, this
patch implements a debug feature which forces round-robin CPU
selection for all such work items.The debug feature defaults to off and can be enabled with a kernel
parameter. The default can be flipped with a debug config option.If you hit this commit during bisection, please refer to 041bd12e272c
("Revert "workqueue: make sure delayed work run in local cpu"") for
more information and ping me.Signed-off-by: Tejun Heo
22 Jan, 2016
1 commit
-
Merge third patch-bomb from Andrew Morton:
"I'm pretty much done for -rc1 now:- the rest of MM, basically
- lib/ updates
- checkpatch, epoll, hfs, fatfs, ptrace, coredump, exit
- cpu_mask simplifications
- kexec, rapidio, MAINTAINERS etc, etc.
- more dma-mapping cleanups/simplifications from hch"
* emailed patches from Andrew Morton : (109 commits)
MAINTAINERS: add/fix git URLs for various subsystems
mm: memcontrol: add "sock" to cgroup2 memory.stat
mm: memcontrol: basic memory statistics in cgroup2 memory controller
mm: memcontrol: do not uncharge old page in page cache replacement
Documentation: cgroup: add memory.swap.{current,max} description
mm: free swap cache aggressively if memcg swap is full
mm: vmscan: do not scan anon pages if memcg swap limit is hit
swap.h: move memcg related stuff to the end of the file
mm: memcontrol: replace mem_cgroup_lruvec_online with mem_cgroup_online
mm: vmscan: pass memcg to get_scan_count()
mm: memcontrol: charge swap to cgroup2
mm: memcontrol: clean up alloc, online, offline, free functions
mm: memcontrol: flatten struct cg_proto
mm: memcontrol: rein in the CONFIG space madness
net: drop tcp_memcontrol.c
mm: memcontrol: introduce CONFIG_MEMCG_LEGACY_KMEM
mm: memcontrol: allow to disable kmem accounting for cgroup2
mm: memcontrol: account "kmem" consumers in cgroup2 memory controller
mm: memcontrol: move kmem accounting code to CONFIG_MEMCG
mm: memcontrol: separate kmem code from legacy tcp accounting code
...
21 Jan, 2016
2 commits
-
Larry Finger reports:
"My PowerBook G4 Aluminum with a 32-bit PPC processor fails to boot for
the 4.4-git series".This is likely due to X still needing /dev/mem access on this platform.
CONFIG_IO_STRICT_DEVMEM is not yet safe to turn on when
CONFIG_STRICT_DEVMEM=y.Remove the default so that old configurations do not change behavior.
Fixes: 90a545e98126 ("restrict /dev/mem to idle io memory ranges")
Reported-by: Larry Finger
Tested-by: Larry Finger
Link: http://marc.info/?l=linux-kernel&m=145332012023825&w=2
Acked-by: Kees Cook
Cc: Arnd Bergmann
Cc: Ingo Molnar
Cc: Russell King
Cc: Andrew Morton
Cc: Greg Kroah-Hartman
Signed-off-by: Dan Williams
Signed-off-by: Linus Torvalds -
UBSAN uses compile-time instrumentation to catch undefined behavior
(UB). Compiler inserts code that perform certain kinds of checks before
operations that could cause UB. If check fails (i.e. UB detected)
__ubsan_handle_* function called to print error message.So the most of the work is done by compiler. This patch just implements
ubsan handlers printing errors.GCC has this capability since 4.9.x [1] (see -fsanitize=undefined
option and its suboptions).
However GCC 5.x has more checkers implemented [2].
Article [3] has a bit more details about UBSAN in the GCC.[1] - https://gcc.gnu.org/onlinedocs/gcc-4.9.0/gcc/Debugging-Options.html
[2] - https://gcc.gnu.org/onlinedocs/gcc/Debugging-Options.html
[3] - http://developerblog.redhat.com/2014/10/16/gcc-undefined-behavior-sanitizer-ubsan/Issues which UBSAN has found thus far are:
Found bugs:
* out-of-bounds access - 97840cb67ff5 ("netfilter: nfnetlink: fix
insufficient validation in nfnetlink_bind")undefined shifts:
* d48458d4a768 ("jbd2: use a better hash function for the revoke
table")* 10632008b9e1 ("clockevents: Prevent shift out of bounds")
* 'x << -1' shift in ext4 -
http://lkml.kernel.org/r/* undefined rol32(0) -
http://lkml.kernel.org/r/* undefined dirty_ratelimit calculation -
http://lkml.kernel.org/r/* undefined roundown_pow_of_two(0) -
http://lkml.kernel.org/r/* [WONTFIX] undefined shift in __bpf_prog_run -
http://lkml.kernel.org/r/WONTFIX here because it should be fixed in bpf program, not in kernel.
signed overflows:
* 32a8df4e0b33f ("sched: Fix odd values in effective_load()
calculations")* mul overflow in ntp -
http://lkml.kernel.org/r/* incorrect conversion into rtc_time in rtc_time64_to_tm() -
http://lkml.kernel.org/r/* unvalidated timespec in io_getevents() -
http://lkml.kernel.org/r/* [NOTABUG] signed overflow in ktime_add_safe() -
http://lkml.kernel.org/r/[akpm@linux-foundation.org: fix unused local warning]
[akpm@linux-foundation.org: fix __int128 build woes]
Signed-off-by: Andrey Ryabinin
Cc: Peter Zijlstra
Cc: Sasha Levin
Cc: Randy Dunlap
Cc: Rasmus Villemoes
Cc: Jonathan Corbet
Cc: Michal Marek
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: "H. Peter Anvin"
Cc: Yury Gribov
Cc: Dmitry Vyukov
Cc: Konstantin Khlebnikov
Cc: Kostya Serebryany
Cc: Johannes Berg
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
17 Jan, 2016
1 commit
-
As illustrated by commit a3afe70b83fd ("[S390] latencytop s390
support."), HAVE_LATENCYTOP_SUPPORT is defined by an architecture to
advertise an implementation of save_stack_trace_tsk.However, as of 9212ddb5eada ("stacktrace: provide save_stack_trace_tsk()
weak alias") a dummy implementation is provided if STACKTRACE=y. Given
that LATENCYTOP already depends on STACKTRACE_SUPPORT and selects
STACKTRACE, we can remove HAVE_LATENCYTOP_SUPPORT altogether.Signed-off-by: Will Deacon
Acked-by: Heiko Carstens
Cc: Vineet Gupta
Cc: Russell King
Cc: James Hogan
Cc: Michal Simek
Cc: Helge Deller
Acked-by: Michael Ellerman
Cc: "David S. Miller"
Cc: Guan Xuetao
Cc: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
16 Jan, 2016
1 commit
-
This patch adds a third argument to macros which create function
definitions for page flags. This argument defines how page-flags
helpers behave on compound functions.For now we define four policies:
- PF_ANY: the helper function operates on the page it gets, regardless
if it's non-compound, head or tail.- PF_HEAD: the helper function operates on the head page of the
compound page if it gets tail page.- PF_NO_TAIL: only head and non-compond pages are acceptable for this
helper function.- PF_NO_COMPOUND: only non-compound pages are acceptable for this
helper function.For now we use policy PF_ANY for all helpers, which matches current
behaviour.We do not enforce the policy for TESTPAGEFLAG, because we have flags
checked for random pages all over the kernel. Noticeable exception to
this is PageTransHuge() which triggers VM_BUG_ON() for tail page.Signed-off-by: Kirill A. Shutemov
Cc: Andrea Arcangeli
Cc: Hugh Dickins
Cc: Dave Hansen
Cc: Mel Gorman
Cc: Rik van Riel
Cc: Vlastimil Babka
Cc: Christoph Lameter
Cc: Naoya Horiguchi
Cc: Steve Capper
Cc: "Aneesh Kumar K.V"
Cc: Johannes Weiner
Cc: Michal Hocko
Cc: Jerome Marchand
Cc: Jérôme Glisse
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
14 Jan, 2016
1 commit
-
Pull libnvdimm updates from Dan Williams:
"The bulk of this has appeared in -next and independently received a
build success notification from the kbuild robot. The 'for-4.5/block-
dax' topic branch was rebased over the weekend to drop the "block
device end-of-life" rework that Al would like to see re-implemented
with a notifier, and to address bug reports against the badblocks
integration.There is pending feedback against "libnvdimm: Add a poison list and
export badblocks" received last week. Linda identified some localized
fixups that we will handle incrementally.Summary:
- Media error handling: The 'badblocks' implementation that
originated in md-raid is up-levelled to a generic capability of a
block device. This initial implementation is limited to being
consulted in the pmem block-i/o path. Later, 'badblocks' will be
consulted when creating dax mappings.- Raw block device dax: For virtualization and other cases that want
large contiguous mappings of persistent memory, add the capability
to dax-mmap a block device directly.- Increased /dev/mem restrictions: Add an option to treat all
io-memory as IORESOURCE_EXCLUSIVE, i.e. disable /dev/mem access
while a driver is actively using an address range. This behavior
is controlled via the new CONFIG_IO_STRICT_DEVMEM option and can be
overridden by the existing "iomem=relaxed" kernel command line
option.- Miscellaneous fixes include a 'pfn'-device huge page alignment fix,
block device shutdown crash fix, and other small libnvdimm fixes"* tag 'libnvdimm-for-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: (32 commits)
block: kill disk_{check|set|clear|alloc}_badblocks
libnvdimm, pmem: nvdimm_read_bytes() badblocks support
pmem, dax: disable dax in the presence of bad blocks
pmem: fail io-requests to known bad blocks
libnvdimm: convert to statically allocated badblocks
libnvdimm: don't fail init for full badblocks list
block, badblocks: introduce devm_init_badblocks
block: clarify badblocks lifetime
badblocks: rename badblocks_free to badblocks_exit
libnvdimm, pmem: move definition of nvdimm_namespace_add_poison to nd.h
libnvdimm: Add a poison list and export badblocks
nfit_test: Enable DSMs for all test NFITs
md: convert to use the generic badblocks code
block: Add badblock management for gendisks
badblocks: Add core badblock management code
block: fix del_gendisk() vs blkdev_ioctl crash
block: enable dax for raw block devices
block: introduce bdev_file_inode()
restrict /dev/mem to idle io memory ranges
arch: consolidate CONFIG_STRICT_DEVM in lib/Kconfig.debug
...
13 Jan, 2016
1 commit
-
Pull networking updates from Davic Miller:
1) Support busy polling generically, for all NAPI drivers. From Eric
Dumazet.2) Add byte/packet counter support to nft_ct, from Floriani Westphal.
3) Add RSS/XPS support to mvneta driver, from Gregory Clement.
4) Implement IPV6_HDRINCL socket option for raw sockets, from Hannes
Frederic Sowa.5) Add support for T6 adapter to cxgb4 driver, from Hariprasad Shenai.
6) Add support for VLAN device bridging to mlxsw switch driver, from
Ido Schimmel.7) Add driver for Netronome NFP4000/NFP6000, from Jakub Kicinski.
8) Provide hwmon interface to mlxsw switch driver, from Jiri Pirko.
9) Reorganize wireless drivers into per-vendor directories just like we
do for ethernet drivers. From Kalle Valo.10) Provide a way for administrators "destroy" connected sockets via the
SOCK_DESTROY socket netlink diag operation. From Lorenzo Colitti.11) Add support to add/remove multicast routes via netlink, from Nikolay
Aleksandrov.12) Make TCP keepalive settings per-namespace, from Nikolay Borisov.
13) Add forwarding and packet duplication facilities to nf_tables, from
Pablo Neira Ayuso.14) Dead route support in MPLS, from Roopa Prabhu.
15) TSO support for thunderx chips, from Sunil Goutham.
16) Add driver for IBM's System i/p VNIC protocol, from Thomas Falcon.
17) Rationalize, consolidate, and more completely document the checksum
offloading facilities in the networking stack. From Tom Herbert.18) Support aborting an ongoing scan in mac80211/cfg80211, from
Vidyullatha Kanchanapally.19) Use per-bucket spinlock for bpf hash facility, from Tom Leiming.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1375 commits)
net: bnxt: always return values from _bnxt_get_max_rings
net: bpf: reject invalid shifts
phonet: properly unshare skbs in phonet_rcv()
dwc_eth_qos: Fix dma address for multi-fragment skbs
phy: remove an unneeded condition
mdio: remove an unneed condition
mdio_bus: NULL dereference on allocation error
net: Fix typo in netdev_intersect_features
net: freescale: mac-fec: Fix build error from phy_device API change
net: freescale: ucc_geth: Fix build error from phy_device API change
bonding: Prevent IPv6 link local address on enslaved devices
IB/mlx5: Add flow steering support
net/mlx5_core: Export flow steering API
net/mlx5_core: Make ipv4/ipv6 location more clear
net/mlx5_core: Enable flow steering support for the IB driver
net/mlx5_core: Initialize namespaces only when supported by device
net/mlx5_core: Set priority attributes
net/mlx5_core: Connect flow tables
net/mlx5_core: Introduce modify flow table command
net/mlx5_core: Managing root flow table
...
12 Jan, 2016
1 commit
-
Pull MMC updates from Ulf Hansson:
"MMC core:
- Optimize boot time by detecting cards simultaneously
- Make runtime resume default behavior for MMC/SD
- Enable MMC/SD/SDIO devices to suspend/resume asynchronously
- Allow more than 8 partitions per card
- Introduce MMC_CAP2_NO_SDIO to prevent unsupported SDIO commands
- Support the standard DT wakeup-source property
- Fix driver strength switching for HS200 and HS400
- Fix switch command timeout
- Fix invalid vdd in voltage switch power cycle for SDIOMMC host:
- sdhci: Restore behavior when setting VDD via external regulator
- sdhci: A couple of changes/fixes related to the dma support
- sdhci-tegra: Add Tegra210 support
- sdhci-tegra: Support for UHS-I cards including tuning support
- sdhci-of-at91: Add PM support
- sh_mmcif: Rework dma channel handling
- mvsdio: Delete platform data code path"* tag 'mmc-v4.5' of git://git.linaro.org/people/ulf.hansson/mmc: (52 commits)
mmc: dw_mmc: remove the unused quirks
mmc: sdhci-pci: use to_pci_dev()
mmc: cb710: use to_platform_device()
mmc: tegra: use correct accessor for misc ctrl register
mmc: tegra: enable UHS-I modes
mmc: tegra: implement UHS tuning
mmc: tegra: disable SPI_MODE_CLKEN
mmc: tegra: implement module external clock change
mmc: sdhci: restore behavior when setting VDD via external regulator
mmc: It is not an error for the card to be removed while suspended
mmc: block: Allow more than 8 partitions per card
mmc: core: Optimize boot time by detecting cards simultaneously
mmc: dw_mmc: use resource_size_t to store physical address
mmc: core: fix __mmc_switch timeout caused by preempt
mmc: usdhi6rol0: handle NULL data in timeout
mmc: of_mmc_spi: Add IRQF_ONESHOT to interrupt flags
mmc: mediatek: change some dev_err to dev_dbg
mmc: enable MMC/SD/SDIO device to suspend/resume asynchronously
mmc: sdhci: Fix sdhci_runtime_pm_bus_on/off()
mmc: sdhci: 64-bit DMA actually has 4-byte alignment
...
09 Jan, 2016
2 commits
-
This effectively promotes IORESOURCE_BUSY to IORESOURCE_EXCLUSIVE
semantics by default. If userspace really believes it is safe to access
the memory region it can also perform the extra step of disabling an
active driver. This protects device address ranges with read side
effects and otherwise directs userspace to use the driver.Persistent memory presents a large "mistake surface" to /dev/mem as now
accidental writes can corrupt a filesystem.In general if a device driver is busily using a memory region it already
informs other parts of the kernel to not touch it via
request_mem_region(). /dev/mem should honor the same safety restriction
by default. Debugging a device driver from userspace becomes more
difficult with this enabled. Any application using /dev/mem or mmap of
sysfs pci resources will now need to perform the extra step of either:1/ Disabling the driver, for example:
echo > /dev/bus//drivers//unbind
2/ Rebooting with "iomem=relaxed" on the command line
3/ Recompiling with CONFIG_IO_STRICT_DEVMEM=n
Traditional users of /dev/mem like dosemu are unaffected because the
first 1MB of memory is not subject to the IO_STRICT_DEVMEM restriction.
Legacy X configurations use /dev/mem to talk to graphics hardware, but
that functionality has since moved to kernel graphics drivers.Cc: Arnd Bergmann
Cc: Russell King
Cc: Andrew Morton
Cc: Greg Kroah-Hartman
Acked-by: Kees Cook
Acked-by: Ingo Molnar
Signed-off-by: Dan Williams -
Let all the archs that implement devmem_is_allowed() opt-in to a common
definition of CONFIG_STRICT_DEVM in lib/Kconfig.debug.Cc: Kees Cook
Cc: Russell King
Cc: Will Deacon
Cc: Benjamin Herrenschmidt
Cc: Martin Schwidefsky
Cc: Thomas Gleixner
Cc: "H. Peter Anvin"
Cc: Andrew Morton
Cc: Greg Kroah-Hartman
Cc: "David S. Miller"
Acked-by: Catalin Marinas
Acked-by: Heiko Carstens
[heiko: drop 'default y' for s390]
Acked-by: Ingo Molnar
Suggested-by: Arnd Bergmann
Signed-off-by: Dan Williams
22 Dec, 2015
1 commit
-
Fault-injection capability for MMC IO uses debugfs entries to configure
the attributes.
FAULT_INJECTION_DEBUG_FS must be enabled to use FAIL_MMC_REQUEST.Replace FAULT_INJECTION with FAULT_INJECTION_DEBUG_FS.
Also remove 'select DEBUG_FS' since FAULT_INJECTION_DEBUG_FS depends on
it.Signed-off-by: Adrien Schildknecht
Signed-off-by: Ulf Hansson
09 Dec, 2015
1 commit
-
Workqueue stalls can happen from a variety of usage bugs such as
missing WQ_MEM_RECLAIM flag or concurrency managed work item
indefinitely staying RUNNING. These stalls can be extremely difficult
to hunt down because the usual warning mechanisms can't detect
workqueue stalls and the internal state is pretty opaque.To alleviate the situation, this patch implements workqueue lockup
detector. It periodically monitors all worker_pools periodically and,
if any pool failed to make forward progress longer than the threshold
duration, triggers warning and dumps workqueue state as follows.BUG: workqueue lockup - pool cpus=0 node=0 flags=0x0 nice=0 stuck for 31s!
Showing busy workqueues and worker pools:
workqueue events: flags=0x0
pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=17/256
pending: monkey_wrench_fn, e1000_watchdog, cache_reap, vmstat_shepherd, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, cgroup_release_agent
workqueue events_power_efficient: flags=0x80
pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=2/256
pending: check_lifetime, neigh_periodic_work
workqueue cgroup_pidlist_destroy: flags=0x0
pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/1
pending: cgroup_pidlist_destroy_work_fn
...The detection mechanism is controller through kernel parameter
workqueue.watchdog_thresh and can be updated at runtime through the
sysfs module parameter file.v2: Decoupled from softlockup control knobs.
Signed-off-by: Tejun Heo
Acked-by: Don Zickus
Cc: Ulrich Obergfell
Cc: Michal Hocko
Cc: Chris Mason
Cc: Andrew Morton
02 Dec, 2015
1 commit
-
This module allows to insert errors in some of netdevice's notifier
events. All network drivers use these notifiers to signal various events
and to check if they are allowed, e.g. PRECHANGEMTU and CHANGEMTU
afterwards. Until recently I had to run failure tests by injecting
a custom module, but now this infrastructure makes it trivial to test
these failure paths. Some of the recent bugs I fixed were found using
this module.
Here's an example:
$ cd /sys/kernel/debug/notifier-error-inject/netdev
$ echo -22 > actions/NETDEV_CHANGEMTU/error
$ ip link set eth0 mtu 1024
RTNETLINK answers: Invalid argumentCC: Akinobu Mita
CC: "David S. Miller"
CC: netdev
Signed-off-by: Nikolay Aleksandrov
Signed-off-by: David S. Miller
10 Nov, 2015
1 commit
-
Pull module updates from Rusty Russell:
"Nothing exciting, minor tweaks and cleanups"* tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
scripts: [modpost] add new sections to white list
modpost: Add flag -E for making section mismatches fatal
params: don't ignore the rest of cmdline if parse_one() fails
modpost: abort if a module symbol is too long
07 Nov, 2015
1 commit
-
This adds a simple module for testing the kernel's printf facilities.
Previously, some %p extensions have caused a wrong return value in case
the entire output didn't fit and/or been unusable in kasprintf(). This
should help catch such issues. Also, it should help ensure that changes
to the formatting algorithms don't break anything.I'm not sure if we have a struct dentry or struct file lying around at
boot time or if we can fake one, but most %p extensions should be
testable, as should the ordinary number and string formatting.The nature of vararg functions means we can't use a more conventional
table-driven approach.For now, this is mostly a skeleton; contributions are very
welcome. Some tests are/will be slightly annoying to write, since the
expected output depends on stuff like CONFIG_*, sizeof(long), runtime
values etc.Signed-off-by: Rasmus Villemoes
Reviewed-by: Kees Cook
Cc: Andy Shevchenko
Cc: Martin Kletzander
Cc: Rasmus Villemoes
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
23 Oct, 2015
1 commit
-
When the kernel compiled with KASAN=y, GCC adds redzones for each
variable on stack. This enlarges function's stack frame and causes:'warning: the frame size of X bytes is larger than Y bytes'
The worst case I've seen for now is following:
../net/wireless/nl80211.c: In function `nl80211_send_wiphy':
../net/wireless/nl80211.c:1731:1: warning: the frame size of 5448 bytes is larger than 2048 bytes [-Wframe-larger-than=]That kind of warning becomes useless with KASAN=y. It doesn't
necessarily indicate that there is some problem in the code, thus we
should turn it off.(The KASAN=y stack size in increased from 16k to 32k for this reason)
Signed-off-by: Andrey Ryabinin
Reported-by: Fengguang Wu
Acked-by: Abylay Ospan
Cc: Andi Kleen
Cc: Ingo Molnar
Cc: Mauro Carvalho Chehab
Cc: Kozlov Sergey
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
06 Oct, 2015
1 commit
-
The section mismatch warning can be easy to miss during the kernel build
process. Allow it to be marked as fatal to be easily caught and prevent
bugs from slipping in.Setting CONFIG_SECTION_MISMATCH_WARN_ONLY=y causes these warnings to be
non-fatal, since there are a number of section mismatches when using
allmodconfig on some architectures, and we do not want to break these
builds by default.Signed-off-by: Nicolas Boichat
Change-Id: Ic346706e3297c9f0d790e3552aa94e5cff9897a6
Signed-off-by: Rusty Russell
04 Sep, 2015
1 commit
-
Pull locking and atomic updates from Ingo Molnar:
"Main changes in this cycle are:- Extend atomic primitives with coherent logic op primitives
(atomic_{or,and,xor}()) and deprecate the old partial APIs
(atomic_{set,clear}_mask())The old ops were incoherent with incompatible signatures across
architectures and with incomplete support. Now every architecture
supports the primitives consistently (by Peter Zijlstra)- Generic support for 'relaxed atomics':
- _acquire/release/relaxed() flavours of xchg(), cmpxchg() and {add,sub}_return()
- atomic_read_acquire()
- atomic_set_release()This came out of porting qwrlock code to arm64 (by Will Deacon)
- Clean up the fragile static_key APIs that were causing repeat bugs,
by introducing a new one:DEFINE_STATIC_KEY_TRUE(name);
DEFINE_STATIC_KEY_FALSE(name);which define a key of different types with an initial true/false
value.Then allow:
static_branch_likely()
static_branch_unlikely()to take a key of either type and emit the right instruction for the
case. To be able to know the 'type' of the static key we encode it
in the jump entry (by Peter Zijlstra)- Static key self-tests (by Jason Baron)
- qrwlock optimizations (by Waiman Long)
- small futex enhancements (by Davidlohr Bueso)
- ... and misc other changes"
* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (63 commits)
jump_label/x86: Work around asm build bug on older/backported GCCs
locking, ARM, atomics: Define our SMP atomics in terms of _relaxed() operations
locking, include/llist: Use linux/atomic.h instead of asm/cmpxchg.h
locking/qrwlock: Make use of _{acquire|release|relaxed}() atomics
locking/qrwlock: Implement queue_write_unlock() using smp_store_release()
locking/lockref: Remove homebrew cmpxchg64_relaxed() macro definition
locking, asm-generic: Add _{relaxed|acquire|release}() variants for 'atomic_long_t'
locking, asm-generic: Rework atomic-long.h to avoid bulk code duplication
locking/atomics: Add _{acquire|release|relaxed}() variants of some atomic operations
locking, compiler.h: Cast away attributes in the WRITE_ONCE() magic
locking/static_keys: Make verify_keys() static
jump label, locking/static_keys: Update docs
locking/static_keys: Provide a selftest
jump_label: Provide a self-test
s390/uaccess, locking/static_keys: employ static_branch_likely()
x86, tsc, locking/static_keys: Employ static_branch_likely()
locking/static_keys: Add selftest
locking/static_keys: Add a new static_key interface
locking/static_keys: Rework update logic
locking/static_keys: Add static_key_{en,dis}able() helpers
...
04 Aug, 2015
1 commit
-
fixes.2015.07.22a: Miscellaneous fixes.
initexp.2015.08.04a: Initialization and expedited updates.
(Single branch due to conflicts.)
03 Aug, 2015
2 commits
-
The 'jump label' self-test is in reality testing static keys - rename things
accordingly.Also prettify the code in various places while at it.
Acked-by: Peter Zijlstra (Intel)
Cc: Andrew Morton
Cc: Jason Baron
Cc: Linus Torvalds
Cc: Paul E. McKenney
Cc: Peter Zijlstra
Cc: Shuah Khan
Cc: Thomas Gleixner
Cc: benh@kernel.crashing.org
Cc: bp@alien8.de
Cc: davem@davemloft.net
Cc: ddaney@caviumnetworks.com
Cc: heiko.carstens@de.ibm.com
Cc: linux-kernel@vger.kernel.org
Cc: liuj97@gmail.com
Cc: luto@amacapital.net
Cc: michael@ellerman.id.au
Cc: rabin@rab.in
Cc: ralf@linux-mips.org
Cc: rostedt@goodmis.org
Cc: vbabka@suse.cz
Cc: will.deacon@arm.com
Link: http://lkml.kernel.org/r/0c091ecebd78a879ed8a71835d205a691a75ab4e.1438227999.git.jbaron@akamai.com
Signed-off-by: Ingo Molnar -
Signed-off-by: Jason Baron
Signed-off-by: Peter Zijlstra (Intel)
Cc: Andrew Morton
Cc: Linus Torvalds
Cc: Paul E. McKenney
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: benh@kernel.crashing.org
Cc: bp@alien8.de
Cc: davem@davemloft.net
Cc: ddaney@caviumnetworks.com
Cc: heiko.carstens@de.ibm.com
Cc: linux-kernel@vger.kernel.org
Cc: liuj97@gmail.com
Cc: luto@amacapital.net
Cc: michael@ellerman.id.au
Cc: rabin@rab.in
Cc: ralf@linux-mips.org
Cc: rostedt@goodmis.org
Cc: shuahkh@osg.samsung.com
Cc: vbabka@suse.cz
Cc: will.deacon@arm.com
Link: http://lkml.kernel.org/r/0c091ecebd78a879ed8a71835d205a691a75ab4e.1438227999.git.jbaron@akamai.com
Signed-off-by: Ingo Molnar
23 Jul, 2015
1 commit
-
Reported-by: Geert Uytterhoeven
Signed-off-by: Paul E. McKenney
20 Jul, 2015
2 commits
-
No one uses this anymore, and this is not the first time the
idea of replacing it with a (now possible) userspace side.
Lock stealing logic was removed long ago in when the lock
was granted to the highest prio.Signed-off-by: Davidlohr Bueso
Cc: Darren Hart
Cc: Steven Rostedt
Cc: Mike Galbraith
Cc: Paul E. McKenney
Cc: Sebastian Andrzej Siewior
Cc: Davidlohr Bueso
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/1435782588-4177-2-git-send-email-dave@stgolabs.net
Signed-off-by: Thomas Gleixner -
Although futexes are well known for being a royal pita,
we really have very little debugging capabilities - except
for relying on tglx's eye half the time.By simply making use of the existing fault-injection machinery,
we can improve this situation, allowing generating artificial
uaddress faults and deadlock scenarios. Of course, when this is
disabled in production systems, the overhead for failure checks
is practically zero -- so this is very cheap at the same time.
Future work would be nice to now enhance trinity to make use of
this.There is a special tunable 'ignore-private', which can filter
out private futexes. Given the tsk->make_it_fail filter and
this option, pi futexes can be narrowed down pretty closely.Signed-off-by: Davidlohr Bueso
Cc: Peter Zijlstra
Cc: Darren Hart
Cc: Davidlohr Bueso
Link: http://lkml.kernel.org/r/1435645562-975-3-git-send-email-dave@stgolabs.net
Signed-off-by: Thomas Gleixner
18 Jul, 2015
1 commit
-
The CONFIG_RCU_CPU_STALL_INFO has been default-y for a couple of
releases with no complaints, so it is time to eliminate this Kconfig
option entirely, so that the long-form RCU CPU stall warnings cannot
be disabled. This commit does just that.Signed-off-by: Paul E. McKenney
04 Jul, 2015
1 commit
-
Both CONFIG_SCHEDSTATS=y and CONFIG_TASK_DELAY_ACCT=y track task
sched_info, which results in ugly #if clauses.Simplify the code by introducing a synthethic CONFIG_SCHED_INFO
switch, selected by both.Signed-off-by: Naveen N. Rao
Cc: Balbir Singh
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Srikar Dronamraju
Cc: Thomas Gleixner
Cc: a.p.zijlstra@chello.nl
Cc: ricklind@us.ibm.com
Link: http://lkml.kernel.org/r/8d19eef800811a94b0f91bcbeb27430a884d7433.1435255405.git.naveen.n.rao@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar
28 May, 2015
3 commits
-
This commit applies some warning-omission micro-optimizations to RCU's
various extended-quiescent-state functions, which are on the kernel/user
hotpath for CONFIG_NO_HZ_FULL=y.Reported-by: Rik van Riel
Reported by: Mike Galbraith
Signed-off-by: Paul E. McKenney -
Currently, Kconfig will ask the user whether TASKS_RCU should be set.
This is silly because Kconfig already has all the information that it
needs to set this parameter. This commit therefore directly drives
the value of TASKS_RCU via "select" statements. Which means that
as subsystems require TASKS_RCU, those subsystems will need to add
"select" statements of their own.Reported-by: Ingo Molnar
Signed-off-by: Paul E. McKenney
Cc: Steven Rostedt
Reviewed-by: Pranith Kumar -
Grace-period scans of the rcu_node combining tree normally
proceed quite quickly, so that it is very difficult to reproduce
races against them. This commit therefore allows grace-period
pre-initialization and cleanup to be artificially slowed down,
increasing race-reproduction probability. A pair of pairs of new
Kconfig parameters are provided, RCU_TORTURE_TEST_SLOW_PREINIT to
enable the slowing down of propagating CPU-hotplug changes up the
combining tree along with RCU_TORTURE_TEST_SLOW_PREINIT_DELAY to
specify the delay in jiffies, and RCU_TORTURE_TEST_SLOW_CLEANUP
to enable the slowing down of the end-of-grace-period cleanup scan
along with RCU_TORTURE_TEST_SLOW_CLEANUP_DELAY to specify the delay
in jiffies. Boot-time parameters named rcutree.gp_preinit_delay and
rcutree.gp_cleanup_delay allow these delays to be specified at boot time.Signed-off-by: Paul E. McKenney
07 May, 2015
1 commit
-
Pull RCU fix from Ingo Molnar:
"An RCU Kconfig fix that eliminates an annoying interactive kconfig
question for CONFIG_RCU_TORTURE_TEST_SLOW_INIT"* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
rcu: Control grace-period delays directly from value
18 Apr, 2015
1 commit
-
…k/linux-rcu into core/urgent
Pull RCU fix from Paul E. McKenney:
"This series contains a single change that fixes Kconfig asking pointless
questions."Signed-off-by: Ingo Molnar <mingo@kernel.org>
15 Apr, 2015
5 commits
-
In a misguided attempt to avoid an #ifdef, the use of the
gp_init_delay module parameter was conditioned on the corresponding
RCU_TORTURE_TEST_SLOW_INIT Kconfig variable, using IS_ENABLED() at
the point of use in the code. This meant that the compiler always saw
the delay, which meant that RCU_TORTURE_TEST_SLOW_INIT_DELAY had to be
unconditionally defined. This in turn caused "make oldconfig" to ask
pointless questions about the value of RCU_TORTURE_TEST_SLOW_INIT_DELAY
in cases where it was not even used.This commit avoids these pointless questions by defining gp_init_delay
under #ifdef. In one branch, gp_init_delay is initialized to
RCU_TORTURE_TEST_SLOW_INIT_DELAY and is also a module parameter (thus
allowing boot-time modification), and in the other branch gp_init_delay
is a const variable initialized by default to zero.This approach also simplifies the code at the delay point by eliminating
the IS_DEFINED(). Because gp_init_delay is constant zero in the no-delay
case intended for production use, the "gp_init_delay > 0" check causes
the delay to become dead code, as desired in this case. In addition,
this commit replaces magic constant "10" with the preprocessor variable
PER_RCU_NODE_PERIOD, which controls the number of grace periods that
are allowed to elapse at full speed before a delay is inserted.Reported-by: Linus Torvalds Signed-off-by:
Paul E. McKenney -
Merge first patchbomb from Andrew Morton:
- arch/sh updates
- ocfs2 updates
- kernel/watchdog feature
- about half of mm/
* emailed patches from Andrew Morton : (122 commits)
Documentation: update arch list in the 'memtest' entry
Kconfig: memtest: update number of test patterns up to 17
arm: add support for memtest
arm64: add support for memtest
memtest: use phys_addr_t for physical addresses
mm: move memtest under mm
mm, hugetlb: abort __get_user_pages if current has been oom killed
mm, mempool: do not allow atomic resizing
memcg: print cgroup information when system panics due to panic_on_oom
mm: numa: remove migrate_ratelimited
mm: fold arch_randomize_brk into ARCH_HAS_ELF_RANDOMIZE
mm: split ET_DYN ASLR from mmap ASLR
s390: redefine randomize_et_dyn for ELF_ET_DYN_BASE
mm: expose arch_mmap_rnd when available
s390: standardize mmap_rnd() usage
powerpc: standardize mmap_rnd() usage
mips: extract logic for mmap_rnd()
arm64: standardize mmap_rnd() usage
x86: standardize mmap_rnd() usage
arm: factor out mmap ASLR into mmap_rnd
... -
Additional test patterns for memtest were introduced since commit
63823126c221 ("x86: memtest: add additional (regular) test patterns"),
but looks like Kconfig was not updated that time.Update Kconfig entry with the actual number of maximum test patterns.
Signed-off-by: Vladimir Murzin
Cc: "H. Peter Anvin"
Cc: Catalin Marinas
Cc: Ingo Molnar
Cc: Mark Rutland
Cc: Russell King
Cc: Thomas Gleixner
Cc: Will Deacon
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Memtest is a simple feature which fills the memory with a given set of
patterns and validates memory contents, if bad memory regions is detected
it reserves them via memblock API. Since memblock API is widely used by
other architectures this feature can be enabled outside of x86 world.This patch set promotes memtest to live under generic mm umbrella and
enables memtest feature for arm/arm64.It was reported that this patch set was useful for tracking down an issue
with some errant DMA on an arm64 platform.This patch (of 6):
There is nothing platform dependent in the core memtest code, so other
platforms might benefit from this feature too.[linux@roeck-us.net: MEMTEST depends on MEMBLOCK]
Signed-off-by: Vladimir Murzin
Acked-by: Will Deacon
Tested-by: Mark Rutland
Cc: Ingo Molnar
Cc: Thomas Gleixner
Cc: "H. Peter Anvin"
Cc: Catalin Marinas
Cc: Russell King
Cc: Paul Bolle
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Pull RCU changes from Ingo Molnar:
"The main changes in this cycle were:- changes permitting use of call_rcu() and friends very early in
boot, for example, before rcu_init() is invoked.- add in-kernel API to enable and disable expediting of normal RCU
grace periods.- improve RCU's handling of (hotplug-) outgoing CPUs.
- NO_HZ_FULL_SYSIDLE fixes.
- tiny-RCU updates to make it more tiny.
- documentation updates.
- miscellaneous fixes"
* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (58 commits)
cpu: Provide smpboot_thread_init() on !CONFIG_SMP kernels as well
cpu: Defer smpboot kthread unparking until CPU known to scheduler
rcu: Associate quiescent-state reports with grace period
rcu: Yet another fix for preemption and CPU hotplug
rcu: Add diagnostics to grace-period cleanup
rcutorture: Default to grace-period-initialization delays
rcu: Handle outgoing CPUs on exit from idle loop
cpu: Make CPU-offline idle-loop transition point more precise
rcu: Eliminate ->onoff_mutex from rcu_node structure
rcu: Process offlining and onlining only at grace-period start
rcu: Move rcu_report_unblock_qs_rnp() to common code
rcu: Rework preemptible expedited bitmask handling
rcu: Remove event tracing from rcu_cpu_notify(), used by offline CPUs
rcutorture: Enable slow grace-period initializations
rcu: Provide diagnostic option to slow down grace-period initialization
rcu: Detect stalls caused by failure to propagate up rcu_node tree
rcu: Eliminate empty HOTPLUG_CPU ifdef
rcu: Simplify sync_rcu_preempt_exp_init()
rcu: Put all orphan-callback-related code under same comment
rcu: Consolidate offline-CPU callback initialization
...
20 Mar, 2015
1 commit
-
…pexp.2015.02.26a', 'hotplug.2015.03.20a', 'sysidle.2015.02.26b' and 'tiny.2015.02.26a' into HEAD
doc.2015.02.26a: Documentation changes
earlycb.2015.03.03a: Permit early-boot RCU callbacks
fixes.2015.03.03a: Miscellaneous fixes
gpexp.2015.02.26a: In-kernel expediting of normal grace periods
hotplug.2015.03.20a: CPU hotplug fixes
sysidle.2015.02.26b: NO_HZ_FULL_SYSIDLE fixes
tiny.2015.02.26a: TINY_RCU fixes
13 Mar, 2015
1 commit
-
Recently there's been requests for better sanity
checking in the time code, so that it's more clear
when something is going wrong, since timekeeping issues
could manifest in a large number of strange ways in
various subsystems.Thus, this patch adds some extra infrastructure to
add a check to update_wall_time() to print two new
warnings:1) if we see the call delayed beyond the 'max_cycles'
overflow point,2) or if we see the call delayed beyond the clocksource's
'max_idle_ns' value, which is currently 50% of the
overflow point.This extra infrastructure is conditional on
a new CONFIG_DEBUG_TIMEKEEPING option, also
added in this patch - default off.Tested this a bit by halting qemu for specified
lengths of time to trigger the warnings.Signed-off-by: John Stultz
Cc: Dave Jones
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Prarit Bhargava
Cc: Richard Cochran
Cc: Stephen Boyd
Cc: Thomas Gleixner
Link: http://lkml.kernel.org/r/1426133800-29329-5-git-send-email-john.stultz@linaro.org
[ Improved the changelog and the messages a bit. ]
Signed-off-by: Ingo Molnar