Eric Lee / smarc-ti-linux-kernel | Embedian Git Server

09 Apr, 2014

9 commits

75ff24fa5 Merge branch 'for-3.15' of git://linux-nfs.org/~bfields/linux ...

Pull nfsd updates from Bruce Fields:
"Highlights:
- server-side nfs/rdma fixes from Jeff Layton and Tom Tucker
- xdr fixes (a larger xdr rewrite has been posted but I decided it
would be better to queue it up for 3.16).
- miscellaneous fixes and cleanup from all over (thanks especially to
Kinglong Mee)"

* 'for-3.15' of git://linux-nfs.org/~bfields/linux: (36 commits)
nfsd4: don't create unnecessary mask acl
nfsd: revert v2 half of "nfsd: don't return high mode bits"
nfsd4: fix memory leak in nfsd4_encode_fattr()
nfsd: check passed socket's net matches NFSd superblock's one
SUNRPC: Clear xpt_bc_xprt if xs_setup_bc_tcp failed
NFSD/SUNRPC: Check rpc_xprt out of xs_setup_bc_tcp
SUNRPC: New helper for creating client with rpc_xprt
NFSD: Free backchannel xprt in bc_destroy
NFSD: Clear wcc data between compound ops
nfsd: Don't return NFS4ERR_STALE_STATEID for NFSv4.1+
nfsd4: fix nfs4err_resource in 4.1 case
nfsd4: fix setclientid encode size
nfsd4: remove redundant check from nfsd4_check_resp_size
nfsd4: use more generous NFS4_ACL_MAX
nfsd4: minor nfsd4_replay_cache_entry cleanup
nfsd4: nfsd4_replay_cache_entry should be static
nfsd4: update comments with obsolete function name
rpc: Allow xdr_buf_subsegment to operate in-place
NFSD: Using free_conn free connection
SUNRPC: fix memory leak of peer addresses in XPRT
...

Linus Torvalds
2014-04-09 09:28:14 +0800
ffddc5fd1 fs/ncpfs/dir.c: fix indenting in ncp_lookup() ...

My static checker suggests adding curly braces here. Probably that was
the intent, but actually the code works the same either way. I've just
changed the indenting and left the code as-is.

Signed-off-by: Dan Carpenter
Cc: Petr Vandrovec
Acked-by: Dave Chiluk
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Dan Carpenter
2014-04-09 07:48:53 +0800
15a03ac6f ncpfs/inode.c: fix mismatch printk formats and arguments ...

Conversions to ncp_dbg showed some format/argument mismatches so fix
them.

Signed-off-by: Joe Perches
Cc: Petr Vandrovec
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Joe Perches
2014-04-09 07:48:53 +0800
485b47f68 ncpfs: remove now unused PRINTK macro ...

Uses are gone, remove the macro.

Signed-off-by: Joe Perches
Cc: Petr Vandrovec
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Joe Perches
2014-04-09 07:48:52 +0800
e45ca8baa ncpfs: convert PPRINTK to ncp_vdbg ...

Use a more current logging style.

Convert the paranoia debug statement to vdbg.
Remove the embedded function names as dynamic_debug can do that.

Signed-off-by: Joe Perches
Cc: Petr Vandrovec
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Joe Perches
2014-04-09 07:48:52 +0800
d3b73ca1b ncpfs: convert DPRINTK/DDPRINTK to ncp_dbg ...

Use a more current logging style and enable use of dynamic debugging.

Remove embedded function names, dynamic debug can add this instead.

Signed-off-by: Joe Perches
Cc: Petr Vandrovec
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Joe Perches
2014-04-09 07:48:52 +0800
b41f8b84d ncpfs: Add pr_fmt and convert printks to pr_<level> ...

Convert to a more current logging style.

Add pr_fmt to prefix with "ncpfs: ".
Remove the embedded function names and use "%s: ", __func__

Some previously unprefixed messages now have "ncpfs: "
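
As a rough illustration of the prefixing convention described above, the following userspace sketch mimics the pattern. The `sketch_pr_info` macro and `log_lookup_failure` function are hypothetical names invented for this example; the real kernel macros call printk() rather than snprintf().

```c
#include <stdio.h>

/* Userspace stand-in for the kernel convention: pr_fmt() is defined
 * before the logging macros so every message gets the same prefix. */
#define pr_fmt(fmt) "ncpfs: " fmt

/* Hypothetical helper for this sketch only: formats into a buffer
 * instead of calling printk(). */
#define sketch_pr_info(buf, size, fmt, ...) \
	snprintf(buf, size, pr_fmt(fmt), __VA_ARGS__)

/* "%s: " with __func__ replaces a hand-written function name. */
static int log_lookup_failure(char *buf, size_t size, int err)
{
	return sketch_pr_info(buf, size, "%s: error %d\n", __func__, err);
}
```

Because the prefix lives in one macro, changing it does not require touching the individual call sites.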

Signed-off-by: Joe Perches
Cc: Petr Vandrovec
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Joe Perches
2014-04-09 07:48:52 +0800
e53d77eb8 autofs4: check dev ioctl size before allocating ...

There wasn't any check of the size passed from userspace before trying
to allocate the memory required.

This meant that userspace might request more space than allowed,
triggering an OOM.
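
The general pattern of bounding a caller-supplied size before allocating can be sketched in userspace as follows. This is not the autofs4 code: the function name, the header size and the upper bound are assumptions chosen for the example.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_HDR_SIZE 24u   /* hypothetical fixed header size */
#define SKETCH_MAX_PATH 4096u /* hypothetical cap on the trailing path */

/* Returns a zeroed buffer of `size` bytes, or NULL with *err set when the
 * caller-supplied size is outside the accepted range. */
static void *alloc_checked(size_t size, int *err)
{
	void *p;

	*err = 0;
	if (size < SKETCH_HDR_SIZE) {
		*err = EINVAL;          /* too small to hold the header */
		return NULL;
	}
	if (size > SKETCH_HDR_SIZE + SKETCH_MAX_PATH) {
		*err = ENAMETOOLONG;    /* larger than anything legitimate */
		return NULL;
	}
	p = calloc(1, size);
	if (!p)
		*err = ENOMEM;
	return p;
}
```

The point of the sketch is only the ordering: both range checks run before any memory is requested.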

Signed-off-by: Sasha Levin
Signed-off-by: Ian Kent
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Sasha Levin
2014-04-09 07:48:51 +0800
e9f37d3a8 Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux ...

Pull drm updates from Dave Airlie:
"Highlights:

- drm:

Generic display port aux features, primary plane support, drm
master management fixes, logging cleanups, enforced locking checks
(instead of docs), documentation improvements, minor number
handling cleanup, pseudofs for shared inodes.

- ttm:

add ability to allocate from both ends

- i915:

broadwell features, power domain and runtime pm, per-process
address space infrastructure (not enabled)

- msm:

power management, hdmi audio support

- nouveau:

ongoing GPU fault recovery, initial maxwell support, random fixes

- exynos:

refactored driver to clean up a lot of abstraction, DP support
moved into drm, LVDS bridge support added, parallel panel support

- gma500:

SGX MMU support, SGX irq handling, asle irq work fixes

- radeon:

video engine bringup, ring handling fixes, use dp aux helpers

- vmwgfx:

add rendernode support"

* 'drm-next' of git://people.freedesktop.org/~airlied/linux: (849 commits)
DRM: armada: fix corruption while loading cursors
drm/dp_helper: don't return EPROTO for defers (v2)
drm/bridge: export ptn3460_init function
drm/exynos: remove MODULE_DEVICE_TABLE definitions
ARM: dts: exynos4412-trats2: enable exynos/fimd node
ARM: dts: exynos4210-trats: enable exynos/fimd node
ARM: dts: exynos4412-trats2: add panel node
ARM: dts: exynos4210-trats: add panel node
ARM: dts: exynos4: add MIPI DSI Master node
drm/panel: add S6E8AA0 driver
ARM: dts: exynos4210-universal_c210: add proper panel node
drm/panel: add ld9040 driver
panel/ld9040: add DT bindings
panel/s6e8aa0: add DT bindings
drm/exynos: add DSIM driver
exynos/dsim: add DT bindings
drm/exynos: disallow fbdev initialization if no device is connected
drm/mipi_dsi: create dsi devices only for nodes with reg property
drm/mipi_dsi: add flags to DSI messages
Skip intel_crt_init for Dell XPS 8700
...

Linus Torvalds
2014-04-09 00:52:16 +0800

08 Apr, 2014

28 commits

a7963eb7f Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs ...

Pull ext3 improvements, cleanups, reiserfs fix from Jan Kara:
"various cleanups for ext2, ext3, udf, isofs, a documentation update
for quota, and a fix of a race in reiserfs readdir implementation"

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
reiserfs: fix race in readdir
ext2: acl: remove unneeded include of linux/capability.h
ext3: explicitly remove inode from orphan list after failed direct io
fs/isofs/inode.c add __init to init_inodecache()
ext3: Speedup WB_SYNC_ALL pass
fs/quota/Kconfig: Update filesystems
ext3: Update outdated comment before ext3_ordered_writepage()
ext3: Update PF_MEMALLOC handling in ext3_write_inode()
ext2/3: use prandom_u32() instead of get_random_bytes()
ext3: remove an unneeded check in ext3_new_blocks()
ext3: remove unneeded check in ext3_ordered_writepage()
fs: Mark function as static in ext3/xattr_security.c
fs: Mark function as static in ext3/dir.c
fs: Mark function as static in ext2/xattr_security.c
ext3: Add __init macro to init_inodecache
ext2: Add __init macro to init_inodecache
udf: Add __init macro to init_inodecache
fs: udf: parse_options: blocksize check

Linus Torvalds
2014-04-08 08:59:17 +0800
26c12d933 Merge branch 'akpm' (incoming from Andrew) ...

Merge second patch-bomb from Andrew Morton:
- the rest of MM
- zram updates
- zswap updates
- exit
- procfs
- exec
- wait
- crash dump
- lib/idr
- rapidio
- adfs, affs, bfs, ufs
- cris
- Kconfig things
- initramfs
- small amount of IPC material
- percpu enhancements
- early ioremap support
- various other misc things

* emailed patches from Andrew Morton: (156 commits)
MAINTAINERS: update Intel C600 SAS driver maintainers
fs/ufs: remove unused ufs_super_block_third pointer
fs/ufs: remove unused ufs_super_block_second pointer
fs/ufs: remove unused ufs_super_block_first pointer
fs/ufs/super.c: add __init to init_inodecache()
doc/kernel-parameters.txt: add early_ioremap_debug
arm64: add early_ioremap support
arm64: initialize pgprot info earlier in boot
x86: use generic early_ioremap
mm: create generic early_ioremap() support
x86/mm: sparse warning fix for early_memremap
lglock: map to spinlock when !CONFIG_SMP
percpu: add preemption checks to __this_cpu ops
vmstat: use raw_cpu_ops to avoid false positives on preemption checks
slub: use raw_cpu_inc for incrementing statistics
net: replace __this_cpu_inc in route.c with raw_cpu_inc
modules: use raw_cpu_write for initialization of per cpu refcount.
mm: use raw_cpu ops for determining current NUMA node
percpu: add raw_cpu_ops
slub: fix leak of 'name' in sysfs_slab_add
...

Linus Torvalds
2014-04-08 07:38:06 +0800
fe4487d18 fs/ufs: remove unused ufs_super_block_third pointer ...

Pointer 'usb3' to struct ufs_super_block_third acquired via
ubh_get_usb_third() is never used in function
ufs_read_cylinder_structures(). Thus remove it.

Detected by Coverity: CID 139939.

Signed-off-by: Christian Engelmayer
Cc: Evgeniy Dushistov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Christian Engelmayer
2014-04-08 07:36:16 +0800
48968a112 fs/ufs: remove unused ufs_super_block_second pointer ...

Pointer 'usb2' to struct ufs_super_block_second acquired via
ubh_get_usb_second() is never used in function ufs_statfs(). Thus
remove it.

Detected by Coverity: CID 139940.

Signed-off-by: Christian Engelmayer
Cc: Evgeniy Dushistov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Christian Engelmayer
2014-04-08 07:36:16 +0800
6e0bd34c3 fs/ufs: remove unused ufs_super_block_first pointer ...

Remove occurrences of unused pointers to struct ufs_super_block_first
that were acquired via ubh_get_usb_first().

Detected by Coverity: CID 139929 - CID 139936, CID 139940.

Signed-off-by: Christian Engelmayer
Cc: Evgeniy Dushistov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Christian Engelmayer
2014-04-08 07:36:16 +0800
76ee47357 fs/ufs/super.c: add __init to init_inodecache() ...

init_inodecache is only called by __init init_ufs_fs.

Signed-off-by: Fabian Frederick
Cc: Evgeniy Dushistov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Fabian Frederick
2014-04-08 07:36:16 +0800
16caed319 fault-injection: set bounds on what /proc/self/make-it-fail accepts. ...

/proc/self/make-it-fail is a boolean, but accepts any number, including
negative ones. Change variable to unsigned, and cap upper bound at 1.
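
A minimal userspace sketch of restricting a numeric setting to the boolean range is shown below. The function name `parse_bool_setting` is hypothetical, and strtol() stands in for whatever parsing helper the kernel handler uses.

```c
#include <errno.h>
#include <stdlib.h>

/* Parses `text` as an integer and accepts it only if it is 0 or 1.
 * Returns 0 and stores the value on success, -EINVAL otherwise. */
static int parse_bool_setting(const char *text, int *out)
{
	char *end;
	long v;

	errno = 0;
	v = strtol(text, &end, 0);
	if (errno != 0 || end == text)
		return -EINVAL;         /* not a number, or out of long range */
	if (v < 0 || v > 1)
		return -EINVAL;         /* a boolean only admits 0 and 1 */
	*out = (int)v;
	return 0;
}
```

Keeping the variable signed and rejecting both ends of the range explicitly matches the note that the variable was not made unsigned in the end.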

[akpm@linux-foundation.org: don't make make_it_fail unsigned]
Signed-off-by: Dave Jones
Reviewed-by: Akinobu Mita
Cc: David Rientjes
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Dave Jones
2014-04-08 07:36:10 +0800
758b44407 fs/bfs/inode.c: add __init to init_inodecache() ...

init_inodecache is only called by __init init_bfs_fs

Signed-off-by: Fabian Frederick
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Fabian Frederick
2014-04-08 07:36:08 +0800
8ca577223 affs: add mount option to avoid filename truncates ...

The normal behavior for filenames exceeding a filesystem's limits is to
refuse the operation.

The AFFS standard name length is only 30 characters, against 255 for usual
Linux filesystems. The original implementation truncates filenames by
default; refusing instead requires defining AFFS_NO_TRUNCATE, which means
recompiling the module.

This patch adds a 'nofilenametruncate' mount option so that the user can
easily activate that behavior and avoid a lot of problems (e.g. overwritten
files ...)
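
To make the overwrite risk concrete, here is a small userspace sketch of the two policies. The `store_name` function and `SKETCH_NAME_MAX` constant are invented for illustration and are not the AFFS implementation.

```c
#include <errno.h>
#include <stdbool.h>
#include <string.h>

#define SKETCH_NAME_MAX 30 /* name limit used by this sketch */

/* Copies `name` into `out`, which must hold SKETCH_NAME_MAX + 1 bytes.
 * With no_truncate set, an over-long name is refused; otherwise it is
 * silently cut to the limit. */
static int store_name(const char *name, bool no_truncate, char *out)
{
	size_t len = strlen(name);

	if (len > SKETCH_NAME_MAX) {
		if (no_truncate)
			return -ENAMETOOLONG;
		len = SKETCH_NAME_MAX;
	}
	memcpy(out, name, len);
	out[len] = '\0';
	return 0;
}
```

Two distinct long names that share their first 30 characters end up as the same stored name when truncation is allowed, which is how one file can end up overwriting another.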

Signed-off-by: Fabian Frederick
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Fabian Frederick
2014-04-08 07:36:08 +0800
d40c4d46e fs/affs/dir.c: unlock/brelse dir on failure + code clean-up ...

Commit 0edf977d2ae3 ("[readdir] convert affs") returns directly -EIO
without unlocking dir inode and releasing dir bh when second affs_bread
sequence fails. This patch restores initial behaviour. It also fixes
pr_debug and affs_error to fit in 80 columns + removes reference to
filldir (replaced by dir_emit in the commit above).

Signed-off-by: Fabian Frederick
Cc: Al Viro
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Fabian Frederick
2014-04-08 07:36:08 +0800
adbd319e5 affs: add __init to init_inodecache () ...

init_inodecache is only called by __init init_affs_fs

Signed-off-by: Fabian Frederick
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Fabian Frederick
2014-04-08 07:36:08 +0800
894122db4 fs/adfs/super.c: add __init to init_inodecache() ...

init_inodecache is only called by __init init_adfs_fs.

Signed-off-by: Fabian Frederick
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Fabian Frederick
2014-04-08 07:36:08 +0800
c4082f36f vmcore: continue vmcore initialization if PT_NOTE is found empty ...

Currently, when an empty PT_NOTE is detected, vmcore initialization
fails. That is too harsh, because PT_NOTE can legitimately be empty: for
example, a CPU was offlined but the kdump service was never restarted, so
after a crash the PT_NOTE program header is present but contains no data.
It's better to warn about the empty PT_NOTE and continue to initialise
vmcore.

Ultimately, the multiple PT_NOTE segments are merged into a single one, and
all empty PT_NOTE segments are discarded naturally during the merge. So an
empty PT_NOTE is not visible to user space and vmcore is as good as expected.
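
The "skip and continue" idea can be sketched in userspace as below. The `note_seg` structure and `merge_notes` function are hypothetical stand-ins and do not reflect the actual vmcore data structures.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one note segment. */
struct note_seg {
	const unsigned char *data;
	size_t len;
};

/* Copies all non-empty segments back to back into `out` (capacity `cap`).
 * Empty segments are counted in *skipped and otherwise ignored, rather
 * than aborting the whole merge. Returns the merged length. */
static size_t merge_notes(const struct note_seg *segs, size_t n,
			  unsigned char *out, size_t cap, size_t *skipped)
{
	size_t used = 0;
	size_t i;

	*skipped = 0;
	for (i = 0; i < n; i++) {
		if (segs[i].len == 0) {
			(*skipped)++;   /* a real implementation would warn here */
			continue;
		}
		if (segs[i].len > cap - used)
			break;          /* out of room: stop copying */
		memcpy(out + used, segs[i].data, segs[i].len);
		used += segs[i].len;
	}
	return used;
}
```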

Signed-off-by: WANG Chao
Cc: Vivek Goyal
Cc: HATAYAMA Daisuke
Cc: Greg Pearson
Cc: Baoquan He
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

WANG Chao
2014-04-08 07:36:06 +0800
82e0703b6 include/linux/crash_dump.h: add vmcore_cleanup() prototype ...

Eliminate the following warning in proc/vmcore.c:

fs/proc/vmcore.c:1088:6: warning: no previous prototype for `vmcore_cleanup' [-Wmissing-prototypes]

[akpm@linux-foundation.org: clean up powerpc, remove unneeded EXPORT_SYMBOL]
Signed-off-by: Rashika Kheria
Reviewed-by: Josh Triplett
Cc: Benjamin Herrenschmidt
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Rashika Kheria
2014-04-08 07:36:06 +0800
ad86622b4 wait: swap EXIT_ZOMBIE and EXIT_DEAD to hide EXIT_TRACE from user-space ...

get_task_state() uses the most significant bit to report the state to
user-space, this means that EXIT_ZOMBIE->EXIT_TRACE->EXIT_DEAD transition
can be noticed via /proc as Z -> X -> Z change. Note that this was
possible even before EXIT_TRACE was introduced.

This is not really bad but imho it makes sense to hide EXIT_TRACE from
user-space completely. So the patch simply swaps EXIT_ZOMBIE and
EXIT_DEAD, this way EXIT_TRACE will be seen as EXIT_ZOMBIE by user-space.
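
The effect of the swap can be illustrated with a small sketch. It assumes, for illustration only, that the transient trace state sets both the zombie and the dead bit; the bit values and all names below are hypothetical and not the kernel's definitions.

```c
/* Hypothetical layout: which bit stands for which exit state. */
struct exit_layout {
	unsigned int zombie;
	unsigned int dead;
};

/* Assumption for this sketch: the trace state sets both bits. */
static unsigned int trace_state(const struct exit_layout *l)
{
	return l->zombie | l->dead;
}

/* Reports the letter that belongs to the most significant set bit. */
static char report_letter(const struct exit_layout *l, unsigned int state)
{
	unsigned int high = l->zombie > l->dead ? l->zombie : l->dead;
	unsigned int low = l->zombie > l->dead ? l->dead : l->zombie;

	if (state & high)
		return high == l->zombie ? 'Z' : 'X';
	if (state & low)
		return low == l->zombie ? 'Z' : 'X';
	return 'R';
}
```

With the zombie bit below the dead bit, the combined state reports 'X'; once the two are swapped, the same combined state reports 'Z'.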

Signed-off-by: Oleg Nesterov
Cc: Jan Kratochvil
Cc: Michal Schmidt
Cc: Al Viro
Cc: Lennart Poettering
Cc: Roland McGrath
Cc: Tejun Heo
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Oleg Nesterov
2014-04-08 07:36:06 +0800
23aebe169 exec: kill bprm->tcomm[], simplify the "basename" logic ...

Starting from commit c4ad8f98bef7 ("execve: use 'struct filename *' for
executable name passing") bprm->filename can not go away after
flush_old_exec(), so we do not need to save the binary name in
bprm->tcomm[] added by 96e02d158678 ("exec: fix use-after-free bug in
setup_new_exec()").

And there was never a need for filename_to_taskname-like code, we can
simply do set_task_comm(kbasename(filename)).

This patch has to change set_task_comm() and trace_task_rename() to
accept "const char *", but I think this change is also good.
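
For readers unfamiliar with the idea, a userspace approximation of "take the basename and use it as the task name" might look like this. The helper names and the 16-byte buffer are assumptions of the sketch, not the kernel's set_task_comm() or kbasename().

```c
#include <stdio.h>
#include <string.h>

#define SKETCH_COMM_LEN 16 /* buffer size chosen for this sketch */

/* Returns a pointer to the last path component, without copying. */
static const char *sketch_basename(const char *path)
{
	const char *slash = strrchr(path, '/');

	return slash ? slash + 1 : path;
}

/* Stores the (possibly truncated) basename as the name. */
static void sketch_set_comm(char comm[SKETCH_COMM_LEN], const char *path)
{
	snprintf(comm, SKETCH_COMM_LEN, "%s", sketch_basename(path));
}
```

Since the basename is just a pointer into the original string, taking a `const char *` is enough, which is in line with the signature change mentioned above.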

Signed-off-by: Oleg Nesterov
Cc: Heiko Carstens
Cc: Steven Rostedt
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Oleg Nesterov
2014-04-08 07:36:05 +0800
32ed74a4b procfs: make /proc/*/pagemap 0400 ...

The /proc/*/pagemap files contain sensitive information and currently their
mode is 0444. Change this to 0400, so the VFS will prevent unprivileged
processes from getting file descriptors on arbitrary privileged
/proc/*/pagemap files.

This reduces the scope of address space leaking and bypasses by protecting
already running processes.

Signed-off-by: Djalal Harouni
Acked-by: Kees Cook
Acked-by: Andy Lutomirski
Cc: Eric W. Biederman
Cc: Al Viro
Cc: Oleg Nesterov
Cc: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Djalal Harouni
2014-04-08 07:36:05 +0800
35a35046e procfs: make /proc/*/{stack,syscall,personality} 0400 ...

These procfs files contain sensitive information and currently their
mode is 0444. Change this to 0400, so the VFS will be able to block
unprivileged processes from getting file descriptors on arbitrary
privileged /proc/*/{stack,syscall,personality} files.

This reduces the scope of ASLR leaking and bypasses by protecting already
running processes.

Signed-off-by: Djalal Harouni
Acked-by: Kees Cook
Acked-by: Andy Lutomirski
Cc: Eric W. Biederman
Cc: Al Viro
Cc: Oleg Nesterov
Cc: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Djalal Harouni
2014-04-08 07:36:04 +0800
1c44dbc82 fs/proc/inode.c: use RCU_INIT_POINTER(x, NULL) ...

Replace rcu_assign_pointer(x, NULL) with RCU_INIT_POINTER(x, NULL)

The rcu_assign_pointer() ensures that the initialization of a structure
is carried out before storing a pointer to that structure. And in the
case of the NULL pointer, there is no structure to initialize. So,
rcu_assign_pointer(p, NULL) can be safely converted to
RCU_INIT_POINTER(p, NULL)
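
The ordering argument can be approximated in userspace with C11 atomics. This is an analogy only: the names below are hypothetical, a release store stands in for rcu_assign_pointer(), and a relaxed store stands in for RCU_INIT_POINTER().

```c
#include <stdatomic.h>
#include <stddef.h>

struct config {
	int value;
};

static _Atomic(struct config *) current_config;

/* Publishing a real object: release ordering makes the initialization
 * of *c visible before the pointer itself can be observed. */
static void publish(struct config *c, int value)
{
	c->value = value;
	atomic_store_explicit(&current_config, c, memory_order_release);
}

/* Clearing the pointer: there is no structure whose initialization must
 * be ordered first, so a relaxed store is sufficient. */
static void clear(void)
{
	atomic_store_explicit(&current_config, (struct config *)NULL,
			      memory_order_relaxed);
}

/* Reader side of the analogy. */
static struct config *lookup(void)
{
	return atomic_load_explicit(&current_config, memory_order_acquire);
}
```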

Signed-off-by: Monam Agarwal
Cc: "Paul E. McKenney"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Monam Agarwal
2014-04-08 07:36:04 +0800
49d063cb3 proc: show mnt_id in /proc/pid/fdinfo ...

Currently we have no way to determine from which mount point a file has
been opened. This information is required for properly dumping and
restoring file descriptors in the presence of mount namespaces. It's
possible that two file descriptors are opened using the same path, but
one fd references a mount point from one namespace while the other fd
references one from another namespace.

$ ls -l /proc/1/fd/1
lrwx------ 1 root root 64 Mar 19 23:54 /proc/1/fd/1 -> /dev/null

$ cat /proc/1/fdinfo/1
pos: 0
flags: 0100002
mnt_id: 16

$ cat /proc/1/mountinfo | grep ^16
16 32 0:4 / /dev rw,nosuid shared:2 - devtmpfs devtmpfs rw,size=1013356k,nr_inodes=253339,mode=755
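
A tool consuming this field could extract it along the following lines. The sketch parses fdinfo-style text that has already been read into memory; `parse_mnt_id` is a hypothetical helper, not part of any existing library.

```c
#include <stdio.h>
#include <string.h>

/* Scans fdinfo-style text line by line for a "mnt_id:" entry.
 * Returns 0 and stores the id on success, -1 if the field is absent. */
static int parse_mnt_id(const char *text, int *mnt_id)
{
	const char *line = text;

	while (line && *line) {
		if (sscanf(line, "mnt_id: %d", mnt_id) == 1)
			return 0;
		line = strchr(line, '\n');
		if (line)
			line++;         /* move past the newline */
	}
	return -1;
}
```

The resulting id can then be matched against the first column of /proc/pid/mountinfo, as the grep in the transcript above does.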

Signed-off-by: Andrey Vagin
Acked-by: Pavel Emelyanov
Acked-by: Cyrill Gorcunov
Cc: Rob Landley
Cc: Al Viro
Cc: Oleg Nesterov
Cc: "Eric W. Biederman"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Andrey Vagin
2014-04-08 07:36:04 +0800
f0b5664ba fs/proc/meminfo: meminfo_proc_show(): fix typo in comment ...

It should read "reclaimable slab" and not "reclaimable swap".

Signed-off-by: Luiz Capitulino
Reviewed-by: Rik van Riel
Acked-by: Rafael Aquini
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Luiz Capitulino
2014-04-08 07:36:04 +0800
615d6e875 mm: per-thread vma caching ...

This patch is a continuation of efforts trying to optimize find_vma(),
avoiding potentially expensive rbtree walks to locate a vma upon faults.
The original approach (https://lkml.org/lkml/2013/11/1/410), where the
largest vma was also cached, ended up being too specific and random,
thus further comparison with other approaches was needed. There are
two things to consider when dealing with this, the cache hit rate and
the latency of find_vma(). Improving the hit-rate does not necessarily
translate into finding the vma any faster, as the overhead of any fancy
caching schemes can be too high to consider.

We currently cache the last used vma for the whole address space, which
provides a nice optimization, reducing the total cycles in find_vma() by
up to 250%, for workloads with good locality. On the other hand, this
simple scheme is pretty much useless for workloads with poor locality.
Analyzing ebizzy runs shows that, no matter how many threads are
running, the mmap_cache hit rate is less than 2%, and in many situations
below 1%.

The proposed approach is to replace this scheme with a small per-thread
cache, maximizing hit rates at a very low maintenance cost.
Invalidations are performed by simply bumping up a 32-bit sequence
number. The only expensive operation is in the rare case of a seq
number overflow, where all caches that share the same address space are
flushed. Upon a miss, the proposed replacement policy is based on the
page number that contains the virtual address in question. Concretely,
the following results are seen on an 80 core, 8 socket x86-64 box:

1) System bootup: Most programs are single threaded, so the per-thread
scheme does improve on the ~50% hit rate by just adding a few more slots to
the cache.

+----------------+----------+------------------+
| caching scheme | hit-rate | cycles (billion) |
+----------------+----------+------------------+
| baseline       | 50.61%   | 19.90            |
| patched        | 73.45%   | 13.58            |
+----------------+----------+------------------+

2) Kernel build: This one is already pretty good with the current
approach as we're dealing with good locality.

+----------------+----------+------------------+
| caching scheme | hit-rate | cycles (billion) |
+----------------+----------+------------------+
| baseline       | 75.28%   | 11.03            |
| patched        | 88.09%   | 9.31             |
+----------------+----------+------------------+

3) Oracle 11g Data Mining (4k pages): Similar to the kernel build workload.

+----------------+----------+------------------+
| caching scheme | hit-rate | cycles (billion) |
+----------------+----------+------------------+
| baseline       | 70.66%   | 17.14            |
| patched        | 91.15%   | 12.57            |
+----------------+----------+------------------+

4) Ebizzy: There's a fair amount of variation from run to run, but this
approach always shows nearly perfect hit rates, while the baseline hit rate
is just about non-existent. The cycle count can fluctuate anywhere from
~60 to ~116 for the baseline scheme, but this approach reduces it
considerably. For instance, with 80 threads:

+----------------+----------+------------------+
| caching scheme | hit-rate | cycles (billion) |
+----------------+----------+------------------+
| baseline       | 1.06%    | 91.54            |
| patched        | 99.97%   | 14.18            |
+----------------+----------+------------------+
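
The scheme described in this message (a small per-thread cache, invalidated by a sequence number and indexed by page number) can be sketched roughly as follows. All names, the slot count and the 4 KiB page assumption are hypothetical, and the sketch leaves out the sequence-number overflow flush mentioned above.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12 /* assumes 4 KiB pages */
#define SKETCH_SLOTS 4       /* hypothetical cache size, a power of two */

struct region { uintptr_t start, end; }; /* stand-in for a vma: [start, end) */
struct addr_space { uint32_t seqnum; };  /* bumped on every invalidation */

struct thread_cache {
	uint32_t seqnum; /* address-space seqnum the slots belong to */
	const struct region *slot[SKETCH_SLOTS];
};

/* Drops all entries if the address space changed since the last use. */
static void cache_validate(struct thread_cache *tc, const struct addr_space *as)
{
	if (tc->seqnum != as->seqnum) {
		for (size_t i = 0; i < SKETCH_SLOTS; i++)
			tc->slot[i] = NULL;
		tc->seqnum = as->seqnum;
	}
}

/* Returns the cached region containing addr, or NULL on a miss. */
static const struct region *cache_find(struct thread_cache *tc,
				       const struct addr_space *as,
				       uintptr_t addr)
{
	cache_validate(tc, as);
	for (size_t i = 0; i < SKETCH_SLOTS; i++) {
		const struct region *r = tc->slot[i];

		if (r && r->start <= addr && addr < r->end)
			return r;
	}
	return NULL;
}

/* After a miss and a slow lookup, the slot is chosen by page number. */
static void cache_update(struct thread_cache *tc, const struct addr_space *as,
			 uintptr_t addr, const struct region *r)
{
	cache_validate(tc, as);
	tc->slot[(addr >> SKETCH_PAGE_SHIFT) & (SKETCH_SLOTS - 1)] = r;
}

/* Invalidation is just a counter bump; caches notice it lazily. */
static void invalidate(struct addr_space *as)
{
	as->seqnum++;
}
```

Invalidation stays cheap because no thread's cache is touched at invalidation time; each cache discovers the stale sequence number on its next lookup.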

[akpm@linux-foundation.org: fix nommu build, per Davidlohr]
[akpm@linux-foundation.org: document vmacache_valid() logic]
[akpm@linux-foundation.org: attempt to untangle header files]
[akpm@linux-foundation.org: add vmacache_find() BUG_ON]
[hughd@google.com: add vmacache_valid_mm() (from Oleg)]
[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: adjust and enhance comments]
Signed-off-by: Davidlohr Bueso
Reviewed-by: Rik van Riel
Acked-by: Linus Torvalds
Reviewed-by: Michel Lespinasse
Cc: Oleg Nesterov
Tested-by: Hugh Dickins
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Davidlohr Bueso
2014-04-08 07:35:53 +0800
f1820361f mm: implement ->map_pages for page cache ...

filemap_map_pages() is a generic implementation of ->map_pages() for
filesystems that use the page cache.

It should be safe to use filemap_map_pages() for ->map_pages() if the
filesystem uses filemap_fault() for ->fault().

Signed-off-by: Kirill A. Shutemov
Acked-by: Linus Torvalds
Cc: Mel Gorman
Cc: Rik van Riel
Cc: Andi Kleen
Cc: Matthew Wilcox
Cc: Dave Hansen
Cc: Alexander Viro
Cc: Dave Chinner
Cc: Ning Qu
Cc: Hugh Dickins
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Kirill A. Shutemov
2014-04-08 07:35:53 +0800
ab0e113f6 exec: kill the unnecessary mm->def_flags setting in load_elf_binary() ...

load_elf_binary() sets current->mm->def_flags = def_flags and def_flags
is always zero. Not only does this look strange, it is also unnecessary
because mm_init() has already set ->def_flags = 0.

Signed-off-by: Alex Thorlton
Suggested-by: Oleg Nesterov
Cc: Gerald Schaefer
Cc: Martin Schwidefsky
Cc: Heiko Carstens
Cc: Christian Borntraeger
Cc: Paolo Bonzini
Cc: "Kirill A. Shutemov"
Cc: Mel Gorman
Acked-by: Rik van Riel
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Andrea Arcangeli
Cc: Oleg Nesterov
Cc: "Eric W. Biederman"
Cc: Alexander Viro
Cc: Johannes Weiner
Cc: David Rientjes
Cc: Paolo Bonzini
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Alex Thorlton
2014-04-08 07:35:52 +0800
87c1b497c ntfs: logging clean-up ...

- Convert spinlock/static array to va_format (inspired by Joe Perches'
help on previous logging patches).

- Convert printk(KERN_ERR to pr_warn in __ntfs_warning.

- Convert printk(KERN_ERR to pr_err in __ntfs_error.

- Convert printk(KERN_DEBUG to pr_debug in __ntfs_debug. (Note that
__ntfs_debug is still guarded by #if DEBUG)

- Improve !DEBUG to parse all arguments (Joe Perches).

- Sparse pr_foo() conversions in super.c

NTFS, NTFS-fs prefixes as well as 'warning' and 'error' were removed:
pr_foo() automatically adds the module name and the error level is already
specified.

Signed-off-by: Fabian Frederick
Cc: Anton Altaparmakov
Cc: Joe Perches
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Fabian Frederick
2014-04-08 07:35:49 +0800
240cd6a81 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client ...

Pull Ceph updates from Sage Weil:
"The biggest chunk is a series of patches from Ilya that add support
for new Ceph osd and crush map features, including some new tunables,
primary affinity, and the new encoding that is needed for erasure
coding support. This brings things into parity with the server side
and the looming firefly release. There is also support for allocation
hints in RBD that help limit fragmentation on the server side.

There is also a series of patches from Zheng fixing NFS reexport,
directory fragmentation support, flock vs fcntl behavior, and some
issues with clustered MDS.

Finally, there are some miscellaneous fixes from Yunchuan Wen for
fscache, Fabian Frederick for ACLs, and from me for fsync(dirfd)
behavior"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (79 commits)
ceph: skip invalid dentry during dcache readdir
libceph: dump pool {read,write}_tier to debugfs
libceph: output primary affinity values on osdmap updates
ceph: flush cap release queue when trimming session caps
ceph: don't grabs open file reference for aborted request
ceph: drop extra open file reference in ceph_atomic_open()
ceph: preallocate buffer for readdir reply
libceph: enable PRIMARY_AFFINITY feature bit
libceph: redo ceph_calc_pg_primary() in terms of ceph_calc_pg_acting()
libceph: add support for osd primary affinity
libceph: add support for primary_temp mappings
libceph: return primary from ceph_calc_pg_acting()
libceph: switch ceph_calc_pg_acting() to new helpers
libceph: introduce apply_temps() helper
libceph: introduce pg_to_raw_osds() and raw_to_up_osds() helpers
libceph: ceph_can_shift_osds(pool) and pool type defines
libceph: ceph_osd_{exists,is_up,is_down}(osd) definitions
libceph: enable OSDMAP_ENC feature bit
libceph: primary_affinity decode bits
libceph: primary_affinity infrastructure
...

Linus Torvalds
2014-04-08 02:09:13 +0800
302111259 Merge tag 'for-f2fs-3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs ...

Pull f2fs updates from Jaegeuk Kim:
"This patch-set includes the following major enhancement patches.
- introduce large directory support
- introduce f2fs_issue_flush to merge redundant flush commands
- merge write IOs as much as possible aligned to the segment
- add sysfs entries to tune the f2fs configuration
- use radix_tree for the free_nid_list to reduce in-memory operations
- remove costly bit operations in f2fs_find_entry
- enhance the readahead flow for CP/NAT/SIT/SSA blocks

The other bug fixes are as follows:
- recover xattr node blocks correctly after sudden-power-cut
- fix to calculate the maximum number of node ids
- enhance to handle many error cases

And, there are a bunch of cleanups"

* tag 'for-f2fs-3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (62 commits)
f2fs: fix wrong statistics of inline data
f2fs: check the acl's validity before setting
f2fs: introduce f2fs_issue_flush to avoid redundant flush issue
f2fs: fix to cover io->bio with io_rwsem
f2fs: fix error path when fail to read inline data
f2fs: use list_for_each_entry{_safe} for simplyfying code
f2fs: avoid free slab cache under spinlock
f2fs: avoid unneeded lookup when xattr name length is too long
f2fs: avoid unnecessary bio submit when wait page writeback
f2fs: return -EIO when node id is not matched
f2fs: avoid RECLAIM_FS-ON-W warning
f2fs: skip unnecessary node writes during fsync
f2fs: introduce fi->i_sem to protect fi's info
f2fs: change reclaim rate in percentage
f2fs: add missing documentation for dir_level
f2fs: remove unnecessary threshold
f2fs: throttle the memory footprint with a sysfs entry
f2fs: avoid to drop nat entries due to the negative nr_shrink
f2fs: call f2fs_wait_on_page_writeback instead of native function
f2fs: introduce nr_pages_to_write for segment alignment
...

Linus Torvalds
2014-04-08 01:55:36 +0800
c29aa153e Merge tag 'for-linus-20140405' of git://git.infradead.org/linux-mtd ...

Pull MTD updates from Brian Norris:
- A few SPI NOR ID definitions
- Kill the NAND "max pagesize" restriction
- Fix some x16 bus-width NAND support
- Add NAND JEDEC parameter page support
- DT bindings for NAND ECC
- GPMI NAND updates (subpage reads)
- More OMAP NAND refactoring
- New STMicro SPI NOR driver (now in 40 patches!)
- A few other random bugfixes

* tag 'for-linus-20140405' of git://git.infradead.org/linux-mtd: (120 commits)
Fix index regression in nand_read_subpage
mtd: diskonchip: mem resource name is not optional
mtd: nand: fix mention to CONFIG_MTD_NAND_ECC_BCH
mtd: nand: fix GET/SET_FEATURES address on 16-bit devices
mtd: omap2: Use devm_ioremap_resource()
mtd: denali_dt: Use devm_ioremap_resource()
mtd: devices: elm: update DRIVER_NAME as "omap-elm"
mtd: devices: elm: configure parallel channels based on ecc_steps
mtd: devices: elm: clean elm_load_syndrome
mtd: devices: elm: check for hardware engine's design constraints
mtd: st_spi_fsm: Succinctly reorganise .remove()
mtd: st_spi_fsm: Allow loop to run at least once before giving up CPU
mtd: st_spi_fsm: Correct vendor name spelling issue - missing "M"
mtd: st_spi_fsm: Avoid duplicating MTD core code
mtd: st_spi_fsm: Remove useless consts from function arguments
mtd: st_spi_fsm: Convert ST SPI FSM (NOR) Flash driver to new DT partitions
mtd: st_spi_fsm: Move runtime configurable msg sequences into device's struct
mtd: st_spi_fsm: Supply the W25Qxxx chip specific configuration call-back
mtd: st_spi_fsm: Supply the S25FLxxx chip specific configuration call-back
mtd: st_spi_fsm: Supply the MX25xxx chip specific configuration call-back
...

Linus Torvalds
2014-04-08 01:17:30 +0800

07 Apr, 2014

3 commits

48b230a58 f2fs: fix wrong statistics of inline data ...

If we remove a file that has inline data after mount, our statistics become
inaccurate.

cat /sys/kernel/debug/f2fs/status
- Inline_data Inode: 4294967295

Let's add stat_inc_inline_inode() to account for the file's inline data during lookup.

Change log from v1:
o stat in f2fs_lookup() instead of in do_read_inode() for excluding wrong stat.
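
The value 4294967295 shown above equals 2^32 - 1, which is what a 32-bit unsigned counter holds after being decremented once from zero. The following sketch reproduces that symptom with a hypothetical counter; it is not the f2fs statistics code.

```c
#include <stdint.h>

/* Hypothetical counter mirroring the symptom, not the f2fs structure. */
struct sketch_stats {
	uint32_t inline_inodes;
};

/* Called when an inode with inline data is first seen. */
static void stat_inc(struct sketch_stats *s)
{
	s->inline_inodes++;
}

/* Called when such an inode is removed. Without a matching earlier
 * increment, the unsigned counter wraps around below zero. */
static void stat_dec(struct sketch_stats *s)
{
	s->inline_inodes--;
}
```

Pairing every decrement with an earlier increment, as the patch does by counting at lookup time, keeps the counter from wrapping.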

Signed-off-by: Chao Yu
Signed-off-by: Jaegeuk Kim

Chao Yu
2014-04-07 11:40:58 +0800
3a8861e27 f2fs: check the acl's validity before setting ...

Before setting the acl, call posix_acl_valid() to check if it is
valid or not.

Signed-off-by: zhangzhen
Signed-off-by: Jaegeuk Kim

ZhangZhen
2014-04-07 11:18:30 +0800
6b4afdd79 f2fs: introduce f2fs_issue_flush to avoid redundant flush issue ...

Some storage devices show relatively high latencies to complete cache_flush
commands, even though their normal IO speed is pretty high. In such a case,
cache_flush commands need to be merged as much as possible to avoid
issuing them redundantly.
So, this patch introduces a mount option, "-o flush_merge", to mitigate such
overhead.

If this option is enabled by the user, F2FS merges the cache_flush commands and
then issues just one cache_flush on behalf of them. Once the single command is
finished, F2FS sends a completion signal to all the pending threads.

Note that this option can be used under a workload consisting of very intensive
concurrent fsync calls, while the storage handles cache_flush commands slowly.
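
The merging idea can be sketched in a single-threaded simulation: requests pile up on a pending list, one device flush is issued for all of them, and each waiter is then marked complete. The structures and function names are hypothetical and omit the locking and thread wake-ups a real implementation needs.

```c
#include <stddef.h>

/* One waiter that asked for a cache flush. */
struct flush_req {
	int done;
	struct flush_req *next;
};

struct flush_queue {
	struct flush_req *pending;   /* requests not yet covered by a flush */
	unsigned int device_flushes; /* flush commands actually issued */
};

/* Adds a request to the pending list. */
static void queue_flush(struct flush_queue *q, struct flush_req *r)
{
	r->done = 0;
	r->next = q->pending;
	q->pending = r;
}

/* Issues a single flush on behalf of every pending request, then
 * completes each of them. Returns how many requests were completed. */
static size_t run_merged_flush(struct flush_queue *q)
{
	struct flush_req *r = q->pending;
	size_t completed = 0;

	if (!r)
		return 0;
	q->pending = NULL;
	q->device_flushes++; /* stands in for the one cache_flush command */
	while (r) {
		struct flush_req *next = r->next;

		r->done = 1; /* stands in for signalling the waiting thread */
		r->next = NULL;
		completed++;
		r = next;
	}
	return completed;
}
```

In this model, three queued requests cost one device flush instead of three, which is the saving the mount option is aiming for on slow-flushing storage.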

Signed-off-by: Jaegeuk Kim

Jaegeuk Kim
2014-04-07 08:50:58 +0800