Eric Lee / smarc-fsl-linux-kernel

26 Jun, 2015

8 commits

6fe29354b printk: implement support for extended console drivers ... Browse Code »

printk log_buf keeps various metadata for each message including its
sequence number and timestamp. The metadata is currently available only
through /dev/kmsg and stripped out before passed onto console drivers. We
want this metadata to be available to console drivers too so that console
consumers can get full information including the metadata and dictionary,
which among other things can be used to detect whether messages got lost
in transit.

This patch implements support for extended console drivers. Consoles can
indicate that they want extended messages by setting the new CON_EXTENDED
flag and they'll be fed messages formatted the same way as /dev/kmsg.

",,,;\n"

If extended consoles exist, in-kernel fragment assembly is disabled. This
ensures that all messages emitted to consoles have full metadata including
sequence number. The contflag carries enough information to reassemble
the fragments from the reader side trivially. Note that this only affects
/dev/kmsg. Regular console and /proc/kmsg outputs are not affected by
this change.

* Extended message formatting for console drivers is enabled iff there
are registered extended consoles.

* Comment describing /dev/kmsg message format updated to add missing
contflag field and help distinguishing variable from verbatim terms.

Signed-off-by: Tejun Heo
Cc: David Miller
Cc: Kay Sievers
Reviewed-by: Petr Mladek
Cc: Tetsuo Handa
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Tejun Heo
2015-06-26 08:00:38 +0800
d43ff430f printk: guard the amount written per line by devkmsg_read() ... Browse Code »

This patchset updates netconsole so that it can emit messages with the
same header as used in /dev/kmsg which gives neconsole receiver full log
information which enables things like structured logging and detection
of lost messages.

This patch (of 7):

devkmsg_read() uses 8k buffer and assumes that the formatted output
message won't overrun which seems safe given LOG_LINE_MAX, the current use
of dict and the escaping method being used; however, we're planning to use
devkmsg formatting wider and accounting for the buffer size properly isn't
that complicated.

This patch defines CONSOLE_EXT_LOG_MAX as 8192 and updates devkmsg_read()
so that it limits output accordingly.

Signed-off-by: Tejun Heo
Cc: David Miller
Cc: Kay Sievers
Reviewed-by: Petr Mladek
Cc: Tetsuo Handa
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Tejun Heo
2015-06-26 08:00:38 +0800
3033f14ab clone: support passing tls argument via C rather than pt_regs magic ... Browse Code »

clone has some of the quirkiest syscall handling in the kernel, with a
pile of special cases, historical curiosities, and architecture-specific
calling conventions. In particular, clone with CLONE_SETTLS accepts a
parameter "tls" that the C entry point completely ignores and some
assembly entry points overwrite; instead, the low-level arch-specific
code pulls the tls parameter out of the arch-specific register captured
as part of pt_regs on entry to the kernel. That's a massive hack, and
it makes the arch-specific code only work when called via the specific
existing syscall entry points; because of this hack, any new clone-like
system call would have to accept an identical tls argument in exactly
the same arch-specific position, rather than providing a unified system
call entry point across architectures.

The first patch allows architectures to handle the tls argument via
normal C parameter passing, if they opt in by selecting
HAVE_COPY_THREAD_TLS. The second patch makes 32-bit and 64-bit x86 opt
into this.

These two patches came out of the clone4 series, which isn't ready for
this merge window, but these first two cleanup patches were entirely
uncontroversial and have acks. I'd like to go ahead and submit these
two so that other architectures can begin building on top of this and
opting into HAVE_COPY_THREAD_TLS. However, I'm also happy to wait and
send these through the next merge window (along with v3 of clone4) if
anyone would prefer that.

This patch (of 2):

clone with CLONE_SETTLS accepts an argument to set the thread-local
storage area for the new thread. sys_clone declares an int argument
tls_val in the appropriate point in the argument list (based on the
various CLONE_BACKWARDS variants), but doesn't actually use or pass along
that argument. Instead, sys_clone calls do_fork, which calls
copy_process, which calls the arch-specific copy_thread, and copy_thread
pulls the corresponding syscall argument out of the pt_regs captured at
kernel entry (knowing what argument of clone that architecture passes tls
in).

Apart from being awful and inscrutable, that also only works because only
one code path into copy_thread can pass the CLONE_SETTLS flag, and that
code path comes from sys_clone with its architecture-specific
argument-passing order. This prevents introducing a new version of the
clone system call without propagating the same architecture-specific
position of the tls argument.

However, there's no reason to pull the argument out of pt_regs when
sys_clone could just pass it down via C function call arguments.

Introduce a new CONFIG_HAVE_COPY_THREAD_TLS for architectures to opt into,
and a new copy_thread_tls that accepts the tls parameter as an additional
unsigned long (syscall-argument-sized) argument. Change sys_clone's tls
argument to an unsigned long (which does not change the ABI), and pass
that down to copy_thread_tls.

Architectures that don't opt into copy_thread_tls will continue to ignore
the C argument to sys_clone in favor of the pt_regs captured at kernel
entry, and thus will be unable to introduce new versions of the clone
syscall.

Patch co-authored by Josh Triplett and Thiago Macieira.

Signed-off-by: Josh Triplett
Acked-by: Andy Lutomirski
Cc: Ingo Molnar
Cc: "H. Peter Anvin"
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: Thiago Macieira
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Josh Triplett
2015-06-26 08:00:38 +0800
8c7fbe579 stddef.h: move offsetofend inside #ifndef/#endif guard, neaten ... Browse Code »

Commit 3876488444e7 ("include/stddef.h: Move offsetofend() from vfio.h
to a generic kernel header") added offsetofend outside the normal
include #ifndef/#endif guard. Move it inside.

Miscellanea:

o remove unnecessary blank line
o standardize offsetof macros whitespace style

Signed-off-by: Joe Perches
Cc: Denys Vlasenko
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Joe Perches
2015-06-26 08:00:38 +0800
b86a50c3b compiler-intel: fix wrong compiler barrier() macro ... Browse Code »

Cleanup commit 73679e508201 ("compiler-intel.h: Remove duplicate
definition") removed the double definition of __memory_barrier()
intrinsics.

However, in doing so, it also removed the preceding #undef barrier by
accident, meaning, the actual barrier() macro from compiler-gcc.h with
inline asm is still in place as __GNUC__ is provided.

Subsequently, barrier() can never be defined as __memory_barrier() from
compiler.h since it already has a definition in place and if we trust
the comment in compiler-intel.h, ecc doesn't support gcc specific asm
statements.

I don't have an ecc at hand (unsure if that's still used in the field?)
and only found this by accident during code review, a revert of that
cleanup would be simplest option.

Fixes: 73679e508201 ("compiler-intel.h: Remove duplicate definition")
Signed-off-by: Daniel Borkmann
Reviewed-by: Pranith Kumar
Cc: Pranith Kumar
Cc: H. Peter Anvin
Cc: mancha security
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Daniel Borkmann
2015-06-26 08:00:38 +0800
cb984d101 compiler-gcc: integrate the various compiler-gcc[345].h files ... Browse Code »

As gcc major version numbers are going to advance rather rapidly in the
future, there's no real value in separate files for each compiler
version.

Deduplicate some of the macros #defined in each file too.

Neaten comments using normal kernel commenting style.

Signed-off-by: Joe Perches
Cc: Andi Kleen
Cc: Michal Marek
Cc: Segher Boessenkool
Cc: Sasha Levin
Cc: Anton Blanchard
Cc: Alan Modra
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Joe Perches
2015-06-26 08:00:38 +0800
f6d133f87 compiler-gcc.h: neatening ... Browse Code »

- Move the inline and noinline blocks together

- Comment neatening

- Alignment of __attribute__ uses

- Consistent naming of __must_be_array macro argument

- Multiline macro neatening

Signed-off-by: Joe Perches
Cc: Andi Kleen
Cc: Michal Marek
Cc: Segher Boessenkool
Cc: Sasha Levin
Cc: Anton Blanchard
Cc: Alan Modra
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Joe Perches
2015-06-26 08:00:37 +0800
479305fd7 zpool: remove zpool_evict() ... Browse Code »

Remove zpool_evict() helper function. As zbud is currently the only
zpool implementation that supports eviction, add zpool and zpool_ops
references to struct zbud_pool and directly call zpool_ops->evict(zpool,
handle) on eviction.

Currently zpool provides the zpool_evict helper which locks the zpool
list lock and searches through all pools to find the specific one
matching the caller, and call the corresponding zpool_ops->evict
function. However, this is unnecessary, as the zbud pool can simply
keep a reference to the zpool that created it, as well as the zpool_ops,
and directly call the zpool_ops->evict function, when it needs to evict
a page. This avoids a spinlock and list search in zpool for each
eviction.

Signed-off-by: Dan Streetman
Cc: Seth Jennings
Cc: Minchan Kim
Cc: Nitin Gupta
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Dan Streetman
2015-06-26 08:00:37 +0800

25 Jun, 2015

32 commits

aefbef10e Merge branch 'akpm' (patches from Andrew) ... Browse Code »

Merge first patchbomb from Andrew Morton:

- a few misc things

- ocfs2 udpates

- kernel/watchdog.c feature work (took ages to get right)

- most of MM. A few tricky bits are held up and probably won't make 4.2.

* emailed patches from Andrew Morton : (91 commits)
mm: kmemleak_alloc_percpu() should follow the gfp from per_alloc()
mm, thp: respect MPOL_PREFERRED policy with non-local node
tmpfs: truncate prealloc blocks past i_size
mm/memory hotplug: print the last vmemmap region at the end of hot add memory
mm/mmap.c: optimization of do_mmap_pgoff function
mm: kmemleak: optimise kmemleak_lock acquiring during kmemleak_scan
mm: kmemleak: avoid deadlock on the kmemleak object insertion error path
mm: kmemleak: do not acquire scan_mutex in kmemleak_do_cleanup()
mm: kmemleak: fix delete_object_*() race when called on the same memory block
mm: kmemleak: allow safe memory scanning during kmemleak disabling
memcg: convert mem_cgroup->under_oom from atomic_t to int
memcg: remove unused mem_cgroup->oom_wakeups
frontswap: allow multiple backends
x86, mirror: x86 enabling - find mirrored memory ranges
mm/memblock: allocate boot time data structures from mirrored memory
mm/memblock: add extra "flags" to memblock to allow selection of memory based on attribute
mm: do not ignore mapping_gfp_mask in page cache allocation paths
mm/cma.c: fix typos in comments
mm/oom_kill.c: print points as unsigned int
mm/hugetlb: handle races in alloc_huge_page and hugetlb_reserve_pages
...

Linus Torvalds
2015-06-25 11:47:21 +0800
cfcc0ad47 Merge tag 'for-f2fs-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs ... Browse Code »

Pull f2fs updates from Jaegeuk Kim:
"New features:
- per-file encryption (e.g., ext4)
- FALLOC_FL_ZERO_RANGE
- FALLOC_FL_COLLAPSE_RANGE
- RENAME_WHITEOUT

Major enhancement/fixes:
- recovery broken superblocks
- enhance f2fs_trim_fs with a discard_map
- fix a race condition on dentry block allocation
- fix a deadlock during summary operation
- fix a missing fiemap result

.. and many minor bug fixes and clean-ups were done"

* tag 'for-f2fs-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (83 commits)
f2fs: do not trim preallocated blocks when truncating after i_size
f2fs crypto: add alloc_bounce_page
f2fs crypto: fix to handle errors likewise ext4
f2fs: drop the volatile_write flag only
f2fs: skip committing valid superblock
f2fs: setting discard option in parse_options()
f2fs: fix to return exact trimmed size
f2fs: support FALLOC_FL_INSERT_RANGE
f2fs: hide common code in f2fs_replace_block
f2fs: disable the discard option when device doesn't support
f2fs crypto: remove alloc_page for bounce_page
f2fs: fix a deadlock for summary page lock vs. sentry_lock
f2fs crypto: clean up error handling in f2fs_fname_setup_filename
f2fs crypto: avoid f2fs_inherit_context for symlink
f2fs crypto: do not set encryption policy for non-directory by ioctl
f2fs crypto: allow setting encryption policy once
f2fs crypto: check context consistent for rename2
f2fs: avoid duplicated code by reusing f2fs_read_end_io
f2fs crypto: use per-inode tfm structure
f2fs: recovering broken superblock during mount
...

Linus Torvalds
2015-06-25 11:38:29 +0800
14738e033 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input ... Browse Code »

Pull input subsystem updates from Dmitry Torokhov:
"Thanks to Samuel Thibault input device (keyboard) LEDs are no longer
hardwired within the input core but use LED subsystem and so allow use
of different triggers; Hans de Goede did a large update for the ALPS
touchpad driver; we have new TI drv2665 haptics driver and DA9063
OnKey driver, and host of other drivers got various fixes"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (55 commits)
Input: pixcir_i2c_ts - fix receive error
MAINTAINERS: remove non existent input mt git tree
Input: improve usage of gpiod API
tty/vt/keyboard: define LED triggers for VT keyboard lock states
tty/vt/keyboard: define LED triggers for VT LED states
Input: export LEDs as class devices in sysfs
Input: cyttsp4 - use swap() in cyttsp4_get_touch()
Input: goodix - do not explicitly set evbits in input device
Input: goodix - export id and version read from device
Input: goodix - fix variable length array warning
Input: goodix - fix alignment issues
Input: add OnKey driver for DA9063 MFD part
Input: elan_i2c - add product IDs FW names
Input: elan_i2c - add support for multi IC type and iap format
Input: focaltech - report finger width to userspace
tty: remove platform_sysrq_reset_seq
Input: synaptics_i2c - use proper boolean values
Input: psmouse - use true instead of 1 for boolean values
Input: cyapa - fix a few typos in comments
Input: stmpe-ts - enforce device tree only mode
...

Linus Torvalds
2015-06-25 10:56:58 +0800
93a4b1b94 Merge tag 'pinctrl-v4.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl ... Browse Code »

Pull pin control updates from Linus Walleij:
"Here is the bulk of pin control changes for the v4.2 series: Quite a
lot of new SoC subdrivers and two new main drivers this time, apart
from that business as usual.

Details:

Core functionality:
- Enable exclusive pin ownership: it is possible to flag a pin
controller so that GPIO and other functions cannot use a single pin
simultaneously.

New drivers:
- NXP LPC18xx System Control Unit pin controller
- Imagination Pistachio SoC pin controller

New subdrivers:
- Freescale i.MX7d SoC
- Intel Sunrisepoint-H PCH
- Renesas PFC R8A7793
- Renesas PFC R8A7794
- Mediatek MT6397, MT8127
- SiRF Atlas 7
- Allwinner A33
- Qualcomm MSM8660
- Marvell Armada 395
- Rockchip RK3368

Cleanups:
- A big cleanup of the Marvell MVEBU driver rectifying it to
correspond to reality
- Drop platform device probing from the SH PFC driver, we are now a
DT only shop for SuperH
- Drop obsolte multi-platform check for SH PFC
- Various janitorial: constification, grammar etc

Improvements:
- The AT91 GPIO portions now supports the set_multiple() feature
- Split out SPI pins on the Xilinx Zynq
- Support DTs without specific function nodes in the i.MX driver"

* tag 'pinctrl-v4.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: (99 commits)
pinctrl: rockchip: add support for the rk3368
pinctrl: rockchip: generalize perpin driver-strength setting
pinctrl: sh-pfc: r8a7794: add SDHI pin groups
pinctrl: sh-pfc: r8a7794: add MMCIF pin groups
pinctrl: sh-pfc: add R8A7794 PFC support
pinctrl: make pinctrl_register() return proper error code
pinctrl: mvebu: armada-39x: add support for Armada 395 variant
pinctrl: mvebu: armada-39x: add missing SATA functions
pinctrl: mvebu: armada-39x: add missing PCIe functions
pinctrl: mvebu: armada-38x: add ptp functions
pinctrl: mvebu: armada-38x: add ua1 functions
pinctrl: mvebu: armada-38x: add nand functions
pinctrl: mvebu: armada-38x: add sata functions
pinctrl: mvebu: armada-xp: add dram functions
pinctrl: mvebu: armada-xp: add nand rb function
pinctrl: mvebu: armada-xp: add spi1 function
pinctrl: mvebu: armada-39x: normalize ref clock naming
pinctrl: mvebu: armada-xp: rename spi to spi0
pinctrl: mvebu: armada-370: align spi1 clock pin naming
pinctrl: mvebu: armada-370: align VDD cpu-pd pin naming with datasheet
...

Linus Torvalds
2015-06-25 10:21:02 +0800
d59b92f93 Merge tag 'backlight-for-linus-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight ... Browse Code »

Pull backlight updates from Lee Jones:
"Changes to existing drivers:

- supply MODULE_DEVICE_TABLE() to ensure probing
- constify struct; da9052_bl
- enable compile test; lcd_l4f00242t03, lcd_lms283fg05, backlight_gpio
- suspend/resume bugfix; lp855x_bl
- devm_gpiod_get_optional() API fixup; pwm_bl
- error handling fixup; backlight"

* tag 'backlight-for-linus-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight:
backlight: Change the return type of backlight_update_status() to int
backlight: pwm_bl: Simplify usage of devm_gpiod_get_optional
backlight: lp855x: Don't clear level on suspend/blank
backlight: Allow compile test of GPIO consumers if !GPIOLIB
video: backlight: da9052: Constify platform_device_id
gpio-backlight: Discover driver during boot time

Linus Torvalds
2015-06-25 09:57:00 +0800
8a8c35fad mm: kmemleak_alloc_percpu() should follow the gfp from per_alloc() ... Browse Code »

Beginning at commit d52d3997f843 ("ipv6: Create percpu rt6_info"), the
following INFO splat is logged:

===============================
[ INFO: suspicious RCU usage. ]
4.1.0-rc7-next-20150612 #1 Not tainted
-------------------------------
kernel/sched/core.c:7318 Illegal context switch in RCU-bh read-side critical section!
other info that might help us debug this:
rcu_scheduler_active = 1, debug_locks = 0
3 locks held by systemd/1:
#0: (rtnl_mutex){+.+.+.}, at: [] rtnetlink_rcv+0x1f/0x40
#1: (rcu_read_lock_bh){......}, at: [] ipv6_add_addr+0x62/0x540
#2: (addrconf_hash_lock){+...+.}, at: [] ipv6_add_addr+0x184/0x540
stack backtrace:
CPU: 0 PID: 1 Comm: systemd Not tainted 4.1.0-rc7-next-20150612 #1
Hardware name: TOSHIBA TECRA A50-A/TECRA A50-A, BIOS Version 4.20 04/17/2014
Call Trace:
dump_stack+0x4c/0x6e
lockdep_rcu_suspicious+0xe7/0x120
___might_sleep+0x1d5/0x1f0
__might_sleep+0x4d/0x90
kmem_cache_alloc+0x47/0x250
create_object+0x39/0x2e0
kmemleak_alloc_percpu+0x61/0xe0
pcpu_alloc+0x370/0x630

Additional backtrace lines are truncated. In addition, the above splat
is followed by several "BUG: sleeping function called from invalid
context at mm/slub.c:1268" outputs. As suggested by Martin KaFai Lau,
these are the clue to the fix. Routine kmemleak_alloc_percpu() always
uses GFP_KERNEL for its allocations, whereas it should follow the gfp
from its callers.

Reviewed-by: Catalin Marinas
Reviewed-by: Kamalesh Babulal
Acked-by: Martin KaFai Lau
Signed-off-by: Larry Finger
Cc: Martin KaFai Lau
Cc: Catalin Marinas
Cc: Tejun Heo
Cc: Christoph Lameter
Cc: [3.18+]
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Larry Finger
2015-06-25 08:49:46 +0800
d1dc6f1bc frontswap: allow multiple backends ... Browse Code »

Change frontswap single pointer to a singly linked list of frontswap
implementations. Update Xen tmem implementation as register no longer
returns anything.

Frontswap only keeps track of a single implementation; any
implementation that registers second (or later) will replace the
previously registered implementation, and gets a pointer to the previous
implementation that the new implementation is expected to pass all
frontswap functions to if it can't handle the function itself. However
that method doesn't really make much sense, as passing that work on to
every implementation adds unnecessary work to implementations; instead,
frontswap should simply keep a list of all registered implementations
and try each implementation for any function. Most importantly, neither
of the two currently existing frontswap implementations in the kernel
actually do anything with any previous frontswap implementation that
they replace when registering.

This allows frontswap to successfully manage multiple implementations by
keeping a list of them all.

Signed-off-by: Dan Streetman
Cc: Konrad Rzeszutek Wilk
Cc: Boris Ostrovsky
Cc: David Vrabel
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Dan Streetman
2015-06-25 08:49:45 +0800
b05b9f5f9 x86, mirror: x86 enabling - find mirrored memory ranges ... Browse Code »

UEFI GetMemoryMap() uses a new attribute bit to mark mirrored memory
address ranges. See UEFI 2.5 spec pages 157-158:

http://www.uefi.org/sites/default/files/resources/UEFI%202_5.pdf

On EFI enabled systems scan the memory map and tell memblock about any
mirrored ranges.

Signed-off-by: Tony Luck
Cc: Xishi Qiu
Cc: Hanjun Guo
Cc: Xiexiuqi
Cc: Ingo Molnar
Cc: Thomas Gleixner
Cc: "H. Peter Anvin"
Cc: Yinghai Lu
Cc: Naoya Horiguchi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Tony Luck
2015-06-25 08:49:45 +0800
a3f5bafcc mm/memblock: allocate boot time data structures from mirrored memory ... Browse Code »

Try to allocate all boot time kernel data structures from mirrored
memory.

If we run out of mirrored memory print warnings, but fall back to using
non-mirrored memory to make sure that we still boot.

By number of bytes, most of what we allocate at boot time is the page
structures. 64 bytes per 4K page on x86_64 ... or about 1.5% of total
system memory. For workloads where the bulk of memory is allocated to
applications this may represent a useful improvement to system
availability since 1.5% of total memory might be a third of the memory
allocated to the kernel.

Signed-off-by: Tony Luck
Cc: Xishi Qiu
Cc: Hanjun Guo
Cc: Xiexiuqi
Cc: Ingo Molnar
Cc: Thomas Gleixner
Cc: "H. Peter Anvin"
Cc: Yinghai Lu
Cc: Naoya Horiguchi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Tony Luck
2015-06-25 08:49:45 +0800
fc6daaf93 mm/memblock: add extra "flags" to memblock to allow selection of memory based on attribute ... Browse Code »

Some high end Intel Xeon systems report uncorrectable memory errors as a
recoverable machine check. Linux has included code for some time to
process these and just signal the affected processes (or even recover
completely if the error was in a read only page that can be replaced by
reading from disk).

But we have no recovery path for errors encountered during kernel code
execution. Except for some very specific cases were are unlikely to ever
be able to recover.

Enter memory mirroring. Actually 3rd generation of memory mirroing.

Gen1: All memory is mirrored
Pro: No s/w enabling - h/w just gets good data from other side of the
mirror
Con: Halves effective memory capacity available to OS/applications

Gen2: Partial memory mirror - just mirror memory begind some memory controllers
Pro: Keep more of the capacity
Con: Nightmare to enable. Have to choose between allocating from
mirrored memory for safety vs. NUMA local memory for performance

Gen3: Address range partial memory mirror - some mirror on each memory
controller
Pro: Can tune the amount of mirror and keep NUMA performance
Con: I have to write memory management code to implement

The current plan is just to use mirrored memory for kernel allocations.
This has been broken into two phases:

1) This patch series - find the mirrored memory, use it for boot time
allocations

2) Wade into mm/page_alloc.c and define a ZONE_MIRROR to pick up the
unused mirrored memory from mm/memblock.c and only give it out to
select kernel allocations (this is still being scoped because
page_alloc.c is scary).

This patch (of 3):

Add extra "flags" to memblock to allow selection of memory based on
attribute. No functional changes

Signed-off-by: Tony Luck
Cc: Xishi Qiu
Cc: Hanjun Guo
Cc: Xiexiuqi
Cc: Ingo Molnar
Cc: Thomas Gleixner
Cc: "H. Peter Anvin"
Cc: Yinghai Lu
Cc: Naoya Horiguchi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Tony Luck
2015-06-25 08:49:44 +0800
8809aa2d2 mm: clarify that the function operates on hugepage pte ... Browse Code »

We have confusing functions to clear pmd, pmd_clear_* and pmd_clear. Add
_huge_ to pmdp_clear functions so that we are clear that they operate on
hugepage pte.

We don't bother about other functions like pmdp_set_wrprotect,
pmdp_clear_flush_young, because they operate on PTE bits and hence
indicate they are operating on hugepage ptes

Signed-off-by: Aneesh Kumar K.V
Acked-by: Kirill A. Shutemov
Cc: Benjamin Herrenschmidt
Cc: Paul Mackerras
Cc: Michael Ellerman
Cc: Andrea Arcangeli
Cc: Martin Schwidefsky
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Aneesh Kumar K.V
2015-06-25 08:49:44 +0800
f28b6ff8c powerpc/mm: use generic version of pmdp_clear_flush() ... Browse Code »

Also move the pmd_trans_huge check to generic code.

Signed-off-by: Aneesh Kumar K.V
Acked-by: Kirill A. Shutemov
Cc: Benjamin Herrenschmidt
Cc: Paul Mackerras
Cc: Michael Ellerman
Cc: Andrea Arcangeli
Cc: Martin Schwidefsky
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Aneesh Kumar K.V
2015-06-25 08:49:44 +0800
15a25b2ea mm/thp: split out pmd collapse flush into separate functions ... Browse Code »

Architectures like ppc64 [1] need to do special things while clearing pmd
before a collapse. For them this operation is largely different from a
normal hugepage pte clear. Hence add a separate function to clear pmd
before collapse. After this patch pmdp_* functions operate only on
hugepage pte, and not on regular pmd_t values pointing to page table.

[1] ppc64 needs to invalidate all the normal page pte mappings we already
have inserted in the hardware hash page table. But before doing that we
need to make sure there are no parallel hash page table insert going on.
So we need to do a kick_all_cpus_sync() before flushing the older hash
table entries. By moving this to a separate function we capture these
details and mention how it is different from a hugepage pte clear.

This patch is a cleanup and only does code movement for clarity. There
should not be any change in functionality.

Signed-off-by: Aneesh Kumar K.V
Acked-by: Kirill A. Shutemov
Cc: Benjamin Herrenschmidt
Cc: Paul Mackerras
Cc: Michael Ellerman
Cc: Andrea Arcangeli
Cc: Martin Schwidefsky
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Aneesh Kumar K.V
2015-06-25 08:49:44 +0800
97f0b1345 tracing: add trace event for memory-failure ... Browse Code »

RAS user space tools like rasdaemon which base on trace event, could
receive mce error event, but no memory recovery result event. So, I want
to add this event to make this scenario complete.

This patch add a event at ras group for memory-failure.

The output like below:
# tracer: nop
#
# entries-in-buffer/entries-written: 2/2 #P:24
#
# _-----=> irqs-off
# / _----=> need-resched
# | / _---=> hardirq/softirq
# || / _--=> preempt-depth
# ||| / delay
# TASK-PID CPU# |||| TIMESTAMP FUNCTION
# | | | |||| | |
mce-inject-13150 [001] .... 277.019359: memory_failure_event: pfn 0x19869: recovery action for free buddy page: Delayed

[xiexiuqi@huawei.com: fix build error]
Signed-off-by: Xie XiuQi
Reviewed-by: Naoya Horiguchi
Acked-by: Steven Rostedt
Cc: Tony Luck
Cc: Chen Gong
Cc: Jim Davis
Signed-off-by: Xie XiuQi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Xie XiuQi
2015-06-25 08:49:43 +0800
cc3e2af42 memory-failure: change type of action_result's param 3 to enum ... Browse Code »

Change type of action_result's param 3 to enum for type consistency,
and rename mf_outcome to mf_result for clearly.

Signed-off-by: Xie XiuQi
Acked-by: Naoya Horiguchi
Cc: Chen Gong
Cc: Jim Davis
Cc: Steven Rostedt
Cc: Tony Luck
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Xie XiuQi
2015-06-25 08:49:43 +0800
cc637b170 memory-failure: export page_type and action result ... Browse Code »

Export 'outcome' and 'action_page_type' to mm.h, so we could use
this emnus outside.

This patch is preparation for adding trace events for memory-failure
recovery action.

Signed-off-by: Xie XiuQi
Acked-by: Naoya Horiguchi
Cc: Chen Gong
Cc: Jim Davis
Cc: Steven Rostedt
Cc: Tony Luck
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Xie XiuQi
2015-06-25 08:49:43 +0800
dc56401fc mm: oom_kill: simplify OOM killer locking ... Browse Code »

The zonelist locking and the oom_sem are two overlapping locks that are
used to serialize global OOM killing against different things.

The historical zonelist locking serializes OOM kills from allocations with
overlapping zonelists against each other to prevent killing more tasks
than necessary in the same memory domain. Only when neither tasklists nor
zonelists from two concurrent OOM kills overlap (tasks in separate memcgs
bound to separate nodes) are OOM kills allowed to execute in parallel.

The younger oom_sem is a read-write lock to serialize OOM killing against
the PM code trying to disable the OOM killer altogether.

However, the OOM killer is a fairly cold error path, there is really no
reason to optimize for highly performant and concurrent OOM kills. And
the oom_sem is just flat-out redundant.

Replace both locking schemes with a single global mutex serializing OOM
kills regardless of context.

Signed-off-by: Johannes Weiner
Acked-by: Michal Hocko
Acked-by: David Rientjes
Cc: Tetsuo Handa
Cc: Andrea Arcangeli
Cc: Dave Chinner
Cc: Vlastimil Babka
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Johannes Weiner
2015-06-25 08:49:43 +0800
16e951966 mm: oom_kill: clean up victim marking and exiting interfaces ... Browse Code »

Rename unmark_oom_victim() to exit_oom_victim(). Marking and unmarking
are related in functionality, but the interface is not symmetrical at
all: one is an internal OOM killer function used during the killing, the
other is for an OOM victim to signal its own death on exit later on.
This has locking implications, see follow-up changes.

While at it, rename mark_tsk_oom_victim() to mark_oom_victim(), which
is easier on the eye.

Signed-off-by: Johannes Weiner
Acked-by: David Rientjes
Acked-by: Michal Hocko
Cc: Tetsuo Handa
Cc: Andrea Arcangeli
Cc: Dave Chinner
Cc: Vlastimil Babka
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Johannes Weiner
2015-06-25 08:49:43 +0800
ead07f6a8 mm/memory-failure: introduce get_hwpoison_page() for consistent refcount handling ... Browse Code »

memory_failure() can run in 2 different mode (specified by
MF_COUNT_INCREASED) in page refcount perspective. When
MF_COUNT_INCREASED is set, memory_failure() assumes that the caller
takes a refcount of the target page. And if cleared, memory_failure()
takes it in it's own.

In current code, however, refcounting is done differently in each caller.
For example, madvise_hwpoison() uses get_user_pages_fast() and
hwpoison_inject() uses get_page_unless_zero(). So this inconsistent
refcounting causes refcount failure especially for thp tail pages.
Typical user visible effects are like memory leak or
VM_BUG_ON_PAGE(!page_count(page)) in isolate_lru_page().

To fix this refcounting issue, this patch introduces get_hwpoison_page()
to handle thp tail pages in the same manner for each caller of hwpoison
code.

memory_failure() might fail to split thp and in such case it returns
without completing page isolation. This is not good because PageHWPoison
on the thp is still set and there's no easy way to unpoison such thps. So
this patch try to roll back any action to the thp in "non anonymous thp"
case and "thp split failed" case, expecting an MCE(SRAR) generated by
later access afterward will properly free such thps.

[akpm@linux-foundation.org: fix CONFIG_HWPOISON_INJECT=m]
Signed-off-by: Naoya Horiguchi
Cc: Andi Kleen
Cc: Tony Luck
Cc: "Kirill A. Shutemov"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Naoya Horiguchi
2015-06-25 08:49:42 +0800
c761471b5 mm: avoid tail page refcounting on non-THP compound pages ... Browse Code »

Reintroduce 8d63d99a5dfb ("mm: avoid tail page refcounting on non-THP
compound pages") after removing bogus VM_BUG_ON_PAGE() in
put_unrefcounted_compound_page().

THP uses tail page refcounting to be able to split huge pages at any time.
Tail page refcounting is not needed for other users of compound pages and
it's harmful because of overhead.

We try to exclude non-THP pages from tail page refcounting using
__compound_tail_refcounted() check. It excludes most common non-THP
compound pages: SL*B and hugetlb, but it doesn't catch rest of __GFP_COMP
users -- drivers.

And it's not only about overhead.

Drivers might want to use compound pages to get refcounting semantics
suitable for mapping high-order pages to userspace. But tail page
refcounting breaks it.

Tail page refcounting uses ->_mapcount in tail pages to store GUP pins on
them. It means GUP pins would affect page_mapcount() for tail pages.
It's not a problem for THP, because it never maps tail pages. But unlike
THP, drivers map parts of compound pages with PTEs and it makes
page_mapcount() be called for tail pages.

In particular, GUP pins would shift PSS up and affect /proc/kpagecount for
such pages. But, I'm not aware about anything which can lead to crash or
other serious misbehaviour.

Since currently all THP pages are anonymous and all drivers pages are not,
we can fix the __compound_tail_refcounted() check by requiring PageAnon()
to enable tail page refcounting.

Signed-off-by: Kirill A. Shutemov
Acked-by: Hugh Dickins
Reviewed-by: Andrea Arcangeli
Reported-by: Borislav Petkov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Kirill A. Shutemov
2015-06-25 08:49:42 +0800
a9919c793 mm: only define hashdist variable when needed ... Browse Code »

For !CONFIG_NUMA, hashdist will always be 0, since it's setter is
otherwise compiled out. So we can save 4 bytes of data and some .text
(although mostly in __init functions) by only defining it for
CONFIG_NUMA.

Signed-off-by: Rasmus Villemoes
Acked-by: David Rientjes
Reviewed-by: Michal Hocko
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Rasmus Villemoes
2015-06-25 08:49:41 +0800
4abad2ca4 mm: new arch_remap() hook ... Browse Code »

Some architectures would like to be triggered when a memory area is moved
through the mremap system call.

This patch introduces a new arch_remap() mm hook which is placed in the
path of mremap, and is called before the old area is unmapped (and the
arch_unmap() hook is called).

Signed-off-by: Laurent Dufour
Cc: "Kirill A. Shutemov"
Cc: Hugh Dickins
Cc: Rik van Riel
Cc: Mel Gorman
Cc: Pavel Emelyanov
Cc: Benjamin Herrenschmidt
Cc: Paul Mackerras
Cc: Michael Ellerman
Cc: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Laurent Dufour
2015-06-25 08:49:41 +0800
2ae416b14 mm: new mm hook framework ... Browse Code »

CRIU is recreating the process memory layout by remapping the checkpointee
memory area on top of the current process (criu). This includes remapping
the vDSO to the place it has at checkpoint time.

However some architectures like powerpc are keeping a reference to the
vDSO base address to build the signal return stack frame by calling the
vDSO sigreturn service. So once the vDSO has been moved, this reference
is no more valid and the signal frame built later are not usable.

This patch serie is introducing a new mm hook framework, and a new
arch_remap hook which is called when mremap is done and the mm lock still
hold. The next patch is adding the vDSO remap and unmap tracking to the
powerpc architecture.

This patch (of 3):

This patch introduces a new set of header file to manage mm hooks:
- per architecture empty header file (arch/x/include/asm/mm-arch-hooks.h)
- a generic header (include/linux/mm-arch-hooks.h)

The architecture which need to overwrite a hook as to redefine it in its
header file, while architecture which doesn't need have nothing to do.

The default hooks are defined in the generic header and are used in the
case the architecture is not defining it.

In a next step, mm hooks defined in include/asm-generic/mm_hooks.h should
be moved here.

Signed-off-by: Laurent Dufour
Suggested-by: Andrew Morton
Cc: "Kirill A. Shutemov"
Cc: Hugh Dickins
Cc: Rik van Riel
Cc: Mel Gorman
Cc: Pavel Emelyanov
Cc: Benjamin Herrenschmidt
Cc: Paul Mackerras
Cc: Michael Ellerman
Cc: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Laurent Dufour
2015-06-25 08:49:41 +0800
1ed58b605 linux/slab.h: fix three off-by-one typos in comment ... Browse Code »

The first is a keyboard-off-by-one, the other two the ordinary mathy kind.

Signed-off-by: Rasmus Villemoes
Acked-by: Christoph Lameter
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Rasmus Villemoes
2015-06-25 08:49:41 +0800
4066c33d0 mm/slab_common: support the slub_debug boot option on specific object size ... Browse Code »

The slub_debug=PU,kmalloc-xx cannot work because in the
create_kmalloc_caches() the s->name is created after the
create_kmalloc_cache() is called. The name is NULL in the
create_kmalloc_cache() so the kmem_cache_flags() would not set the
slub_debug flags to the s->flags. The fix here set up a kmalloc_names
string array for the initialization purpose and delete the dynamic name
creation of kmalloc_caches.

[akpm@linux-foundation.org: s/kmalloc_names/kmalloc_info/, tweak comment text]
Signed-off-by: Gavin Guo
Acked-by: Christoph Lameter
Cc: Pekka Enberg
Cc: David Rientjes
Cc: Joonsoo Kim
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Gavin Guo
2015-06-25 08:49:40 +0800
fe4ba3c34 watchdog: add watchdog_cpumask sysctl to assist nohz ... Browse Code »

Change the default behavior of watchdog so it only runs on the
housekeeping cores when nohz_full is enabled at build and boot time.
Allow modifying the set of cores the watchdog is currently running on
with a new kernel.watchdog_cpumask sysctl.

In the current system, the watchdog subsystem runs a periodic timer that
schedules the watchdog kthread to run. However, nohz_full cores are
designed to allow userspace application code running on those cores to
have 100% access to the CPU. So the watchdog system prevents the
nohz_full application code from being able to run the way it wants to,
thus the motivation to suppress the watchdog on nohz_full cores, which
this patchset provides by default.

However, if we disable the watchdog globally, then the housekeeping
cores can't benefit from the watchdog functionality. So we allow
disabling it only on some cores. See Documentation/lockup-watchdogs.txt
for more information.

[jhubbard@nvidia.com: fix a watchdog crash in some configurations]
Signed-off-by: Chris Metcalf
Acked-by: Don Zickus
Cc: Ingo Molnar
Cc: Ulrich Obergfell
Cc: Thomas Gleixner
Cc: Peter Zijlstra
Cc: Frederic Weisbecker
Signed-off-by: John Hubbard
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Chris Metcalf
2015-06-25 08:49:40 +0800
b5242e98c smpboot: allow excluding cpus from the smpboot threads ... Browse Code »

This patch series allows the watchdog to run by default only on the
housekeeping cores when nohz_full is in effect; this seems to be a good
compromise short of turning it off completely (since the nohz_full cores
can't tolerate a watchdog).

To provide customizability, we add /proc/sys/kernel/watchdog_cpumask so
that the set of cores running the watchdog can be tuned to different
values after bootup.

To implement this customizability, we add a new
smpboot_update_cpumask_percpu_thread() API to the smpboot_thread
subsystem that lets us park or unpark "unwanted" threads.

And now that threads can be parked for long periods of time, we tweak the
/proc//stat and /proc//status code so parked threads aren't
reported as running, which is otherwise confusing.

This patch (of 3):

This change allows some cores to be excluded from running the
smp_hotplug_thread tasks. The following commit to update
kernel/watchdog.c to use this functionality is the motivating example, and
more information on the motivation is provided there.

A new smp_hotplug_thread field is introduced, "cpumask", which is cpumask
field managed by the smpboot subsystem that indicates whether or not the
given smp_hotplug_thread should run on that core; the cpumask is checked
when deciding whether to unpark the thread.

To limit the cpumask to less than cpu_possible, you must call
smpboot_update_cpumask_percpu_thread() after registering.

Signed-off-by: Chris Metcalf
Cc: Don Zickus
Cc: Ingo Molnar
Cc: Ulrich Obergfell
Cc: Thomas Gleixner
Cc: Peter Zijlstra
Cc: Frederic Weisbecker
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Chris Metcalf
2015-06-25 08:49:40 +0800
5286d20c4 configfs: unexport/make static config_item_init() ... Browse Code »

config_item_init() is only used in item.c

Signed-off-by: Fabian Frederick
Cc: Joel Becker
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Fabian Frederick
2015-06-25 08:49:39 +0800
c3cddc4c2 fsnotify: remove obsolete documentation ... Browse Code »

should_send_event is no longer part of struct fsnotify_ops, so remove it.

Signed-off-by: Nikolay Borisov
Reviewed-by: Jan Kara
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Nikolay Borisov
2015-06-25 08:49:38 +0800
e0456717e Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next ... Browse Code »

Pull networking updates from David Miller:

1) Add TX fast path in mac80211, from Johannes Berg.

2) Add TSO/GRO support to ibmveth, from Thomas Falcon

3) Move away from cached routes in ipv6, just like ipv4, from Martin
KaFai Lau.

4) Lots of new rhashtable tests, from Thomas Graf.

5) Run ingress qdisc lockless, from Alexei Starovoitov.

6) Allow servers to fetch TCP packet headers for SYN packets of new
connections, for fingerprinting. From Eric Dumazet.

7) Add mode parameter to pktgen, for testing receive. From Alexei
Starovoitov.

8) Cache access optimizations via simplifications of build_skb(), from
Alexander Duyck.

9) Move page frag allocator under mm/, also from Alexander.

10) Add xmit_more support to hv_netvsc, from KY Srinivasan.

11) Add a counter guard in case we try to perform endless reclassify
loops in the packet scheduler.

12) Extern flow dissector to be programmable and use it in new "Flower"
classifier. From Jiri Pirko.

13) AF_PACKET fanout rollover fixes, performance improvements, and new
statistics. From Willem de Bruijn.

14) Add netdev driver for GENEVE tunnels, from John W Linville.

15) Add ingress netfilter hooks and filtering, from Pablo Neira Ayuso.

16) Fix handling of epoll edge triggers in TCP, from Eric Dumazet.

17) Add an ECN retry fallback for the initial TCP handshake, from Daniel
Borkmann.

18) Add tail call support to BPF, from Alexei Starovoitov.

19) Add several pktgen helper scripts, from Jesper Dangaard Brouer.

20) Add zerocopy support to AF_UNIX, from Hannes Frederic Sowa.

21) Favor even port numbers for allocation to connect() requests, and
odd port numbers for bind(0), in an effort to help avoid
ip_local_port_range exhaustion. From Eric Dumazet.

22) Add Cavium ThunderX driver, from Sunil Goutham.

23) Allow bpf programs to access skb_iif and dev->ifindex SKB metadata,
from Alexei Starovoitov.

24) Add support for T6 chips in cxgb4vf driver, from Hariprasad Shenai.

25) Double TCP Small Queues default to 256K to accomodate situations
like the XEN driver and wireless aggregation. From Wei Liu.

26) Add more entropy inputs to flow dissector, from Tom Herbert.

27) Add CDG congestion control algorithm to TCP, from Kenneth Klette
Jonassen.

28) Convert ipset over to RCU locking, from Jozsef Kadlecsik.

29) Track and act upon link status of ipv4 route nexthops, from Andy
Gospodarek.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1670 commits)
bridge: vlan: flush the dynamically learned entries on port vlan delete
bridge: multicast: add a comment to br_port_state_selection about blocking state
net: inet_diag: export IPV6_V6ONLY sockopt
stmmac: troubleshoot unexpected bits in des0 & des1
net: ipv4 sysctl option to ignore routes when nexthop link is down
net: track link-status of ipv4 nexthops
net: switchdev: ignore unsupported bridge flags
net: Cavium: Fix MAC address setting in shutdown state
drivers: net: xgene: fix for ACPI support without ACPI
ip: report the original address of ICMP messages
net/mlx5e: Prefetch skb data on RX
net/mlx5e: Pop cq outside mlx5e_get_cqe
net/mlx5e: Remove mlx5e_cq.sqrq back-pointer
net/mlx5e: Remove extra spaces
net/mlx5e: Avoid TX CQE generation if more xmit packets expected
net/mlx5e: Avoid redundant dev_kfree_skb() upon NOP completion
net/mlx5e: Remove re-assignment of wq type in mlx5e_enable_rq()
net/mlx5e: Use skb_shinfo(skb)->gso_segs rather than counting them
net/mlx5e: Static mapping of netdev priv resources to/from netdev TX queues
net/mlx4_en: Use HW counters for rx/tx bytes/packets in PF device
...

Linus Torvalds
2015-06-25 07:49:49 +0800
98ec21a01 Merge branch 'sched-hrtimers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip ... Browse Code »

Pull scheduler updates from Thomas Gleixner:
"This series of scheduler updates depends on sched/core and timers/core
branches, which are already in your tree:

- Scheduler balancing overhaul to plug a hard to trigger race which
causes an oops in the balancer (Peter Zijlstra)

- Lockdep updates which are related to the balancing updates (Peter
Zijlstra)"

* 'sched-hrtimers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched,lockdep: Employ lock pinning
lockdep: Implement lock pinning
lockdep: Simplify lock_release()
sched: Streamline the task migration locking a little
sched: Move code around
sched,dl: Fix sched class hopping CBS hole
sched, dl: Convert switched_{from, to}_dl() / prio_changed_dl() to balance callbacks
sched,dl: Remove return value from pull_dl_task()
sched, rt: Convert switched_{from, to}_rt() / prio_changed_rt() to balance callbacks
sched,rt: Remove return value from pull_rt_task()
sched: Allow balance callbacks for check_class_changed()
sched: Use replace normalize_task() with __sched_setscheduler()
sched: Replace post_schedule with a balance callback list

Linus Torvalds
2015-06-25 06:09:40 +0800
e3d8238d7 Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux ... Browse Code »

Pull arm64 updates from Catalin Marinas:
"Mostly refactoring/clean-up:

- CPU ops and PSCI (Power State Coordination Interface) refactoring
following the merging of the arm64 ACPI support, together with
handling of Trusted (secure) OS instances

- Using fixmap for permanent FDT mapping, removing the initial dtb
placement requirements (within 512MB from the start of the kernel
image). This required moving the FDT self reservation out of the
memreserve processing

- Idmap (1:1 mapping used for MMU on/off) handling clean-up

- Removing flush_cache_all() - not safe on ARM unless the MMU is off.
Last stages of CPU power down/up are handled by firmware already

- "Alternatives" (run-time code patching) refactoring and support for
immediate branch patching, GICv3 CPU interface access

- User faults handling clean-up

And some fixes:

- Fix for VDSO building with broken ELF toolchains

- Fix another case of init_mm.pgd usage for user mappings (during
ASID roll-over broadcasting)

- Fix for FPSIMD reloading after CPU hotplug

- Fix for missing syscall trace exit

- Workaround for .inst asm bug

- Compat fix for switching the user tls tpidr_el0 register"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (42 commits)
arm64: use private ratelimit state along with show_unhandled_signals
arm64: show unhandled SP/PC alignment faults
arm64: vdso: work-around broken ELF toolchains in Makefile
arm64: kernel: rename __cpu_suspend to keep it aligned with arm
arm64: compat: print compat_sp instead of sp
arm64: mm: Fix freeing of the wrong memmap entries with !SPARSEMEM_VMEMMAP
arm64: entry: fix context tracking for el0_sp_pc
arm64: defconfig: enable memtest
arm64: mm: remove reference to tlb.S from comment block
arm64: Do not attempt to use init_mm in reset_context()
arm64: KVM: Switch vgic save/restore to alternative_insn
arm64: alternative: Introduce feature for GICv3 CPU interface
arm64: psci: fix !CONFIG_HOTPLUG_CPU build warning
arm64: fix bug for reloading FPSIMD state after CPU hotplug.
arm64: kernel thread don't need to save fpsimd context.
arm64: fix missing syscall trace exit
arm64: alternative: Work around .inst assembler bugs
arm64: alternative: Merge alternative-asm.h into alternative.h
arm64: alternative: Allow immediate branch as alternative instruction
arm64: Rework alternate sequence for ARM erratum 845719
...

Linus Torvalds
2015-06-25 01:02:15 +0800