Eric Lee / smarc-fsl-linux-kernel

26 Aug, 2014

1 commit

51c0ff6db mm: Fix CROSS_MEMORY_ATTACH help text grammar

Signed-off-by: Geert Uytterhoeven
Signed-off-by: Jiri Kosina

Geert Uytterhoeven
2014-08-26 15:35:55 +0800

10 Jul, 2014

2 commits

1823172ab Merge branches 'doc.2014.07.08a', 'fixes.2014.07.09a', 'maintainers.2014.07.08b', 'nocbs.2014.07.07a' and 'torture.2014.07.07a' into HEAD

doc.2014.07.08a: Documentation updates.
fixes.2014.07.09a: Miscellaneous fixes.
maintainers.2014.07.08b: Maintainership updates.
nocbs.2014.07.07a: Callback-offloading fixes.
torture.2014.07.07a: Torture-test updates.

Paul E. McKenney
2014-07-10 00:16:54 +0800
ab74fdfd4 rcu: Handle obsolete references to TINY_PREEMPT_RCU

Signed-off-by: Paul E. McKenney
Reviewed-by: Lai Jiangshan

Paul E. McKenney
2014-07-10 00:14:17 +0800

08 Jul, 2014

1 commit

b58cc46c5 rcu: Don't offload callbacks unless specifically requested

Enabling NO_HZ_FULL currently has the side effect of enabling callback
offloading on all CPUs. This results in lots of additional rcuo kthreads,
and can also increase context switching and wakeups, even in cases where
callback offloading is neither needed nor particularly desirable. This
commit therefore enables callback offloading on a given CPU only if
specifically requested at build time or boot time, or if that CPU has
been specifically designated (again, either at build time or boot time)
as a nohz_full CPU.
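
The rule described above can be modeled in a few lines. This is only an illustrative Python sketch, not kernel code: the function name and its arguments are hypothetical, with `rcu_nocbs` and `nohz_full` standing in for the CPU lists requested at build time or boot time.

```python
def offloaded_cpus(nr_cpus, rcu_nocbs=(), nohz_full=()):
    """Return the set of CPUs whose RCU callbacks would be offloaded.

    Hypothetical model: rcu_nocbs and nohz_full are the CPU lists
    requested at build time or boot time.
    """
    present = set(range(nr_cpus))
    # Offload only what was asked for explicitly, plus nohz_full CPUs.
    return (set(rcu_nocbs) | set(nohz_full)) & present
```

In this model, a four-CPU system with CPUs 2 and 3 designated nohz_full offloads only those two, and a system with no request at all offloads nothing, so no additional rcuo kthreads are needed.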

Signed-off-by: Paul E. McKenney

Paul E. McKenney
2014-07-08 06:13:44 +0800

12 Jun, 2014

1 commit

4251c2a67 Merge tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux

Pull module updates from Rusty Russell:
"Most of this is cleaning up various driver sysfs permissions so we can
re-add the perm check (we unified the module param and sysfs checks,
but the module ones were stronger so we weakened them temporarily).

Param parsing gets documented, and also "--" now forces args to be
handed to init (and ignored by the kernel).

Module NX/RO protections get tightened: we now set them before calling
parse_args()"

* tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
module: set nx before marking module MODULE_STATE_COMING.
samples/kobject/: avoid world-writable sysfs files.
drivers/hid/hid-picolcd_fb: avoid world-writable sysfs files.
drivers/staging/speakup/: avoid world-writable sysfs files.
drivers/regulator/virtual: avoid world-writable sysfs files.
drivers/scsi/pm8001/pm8001_ctl.c: avoid world-writable sysfs files.
drivers/hid/hid-lg4ff.c: avoid world-writable sysfs files.
drivers/video/fbdev/sm501fb.c: avoid world-writable sysfs files.
drivers/mtd/devices/docg3.c: avoid world-writable sysfs files.
speakup: fix incorrect perms on speakup_acntsa.c
cpumask.h: silence warning with -Wsign-compare
Documentation: Update kernel-parameters.tx
param: hand arguments after -- straight to init
modpost: Fix resource leak in read_dump()

Linus Torvalds
2014-06-12 07:09:14 +0800

05 Jun, 2014

11 commits

2071b3e34 Merge branch 'x86/espfix' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into next

Pull x86-64 espfix changes from Peter Anvin:
"This is the espfix64 code, which fixes the IRET information leak as
well as the associated functionality problem. With this code applied,
16-bit stack segments finally work as intended even on a 64-bit
kernel.

Consequently, this patchset also removes the runtime option that we
added as an interim measure.

To help the people working on Linux kernels for very small systems,
this patchset also makes these compile-time configurable features"

* 'x86/espfix' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
Revert "x86-64, modify_ldt: Make support for 16-bit segments a runtime option"
x86, espfix: Make it possible to disable 16-bit support
x86, espfix: Make espfix64 a Kconfig option, fix UML
x86, espfix: Fix broken header guard
x86, espfix: Move espfix definitions into a separate header file
x86-32, espfix: Remove filter for espfix32 due to race
x86-64, espfix: Don't leak bits 31:16 of %esp returning to 16-bit stack

Linus Torvalds
2014-06-05 22:46:15 +0800
647f010bf init/main.c: remove an ifdef

Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Andrew Morton
2014-06-05 07:54:21 +0800
34a1b7236 kthreads: kill CLONE_KERNEL, change kernel_thread(kernel_init) to avoid CLONE_SIGHAND

1. Remove CLONE_KERNEL, it has no users and it is dangerous.

The (old) comment says "List of flags we want to share for kernel
threads" but this is not true, we do not want to share ->sighand by
default. This flag can only be used if the caller is sure that both
parent/child will never play with signals (say, allow_signal/etc).

2. Change rest_init() to clone kernel_init() without CLONE_SIGHAND.

In this case CLONE_SIGHAND does not really hurt, and it looks like an
optimization because copy_sighand() can avoid kmem_cache_alloc().

But in fact this only adds a minor pessimization. kernel_init()
is going to exec the init process, and de_thread() will need to
unshare ->sighand and do kmem_cache_alloc(sighand_cachep) anyway,
but it needs to do more work and take tasklist_lock and siglock.

Signed-off-by: Oleg Nesterov
Acked-by: Peter Zijlstra
Acked-by: Steven Rostedt
Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Mathieu Desnoyers
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Oleg Nesterov
2014-06-05 07:54:21 +0800
7b0b73d76 init/main.c: add initcall_blacklist kernel parameter

When a module is built into the kernel the module_init() function
becomes an initcall. Sometimes debugging through dynamic debug can
help, however, debugging built in kernel modules is typically done by
changing the .config, recompiling, and booting the new kernel in an
effort to determine exactly which module caused a problem.

This patchset can be useful stand-alone or combined with initcall_debug.
There are cases where some initcalls can hang the machine before the
console can be flushed, which can make initcall_debug output inaccurate.
Having the ability to skip initcalls can help further debugging of these
scenarios.

Usage: initcall_blacklist=

ex) added "initcall_blacklist=sgi_uv_sysfs_init" as a kernel parameter and
the log contains:

blacklisting initcall sgi_uv_sysfs_init
...
...
initcall sgi_uv_sysfs_init blacklisted

ex) added "initcall_blacklist=foo_bar,sgi_uv_sysfs_init" as a kernel parameter
and the log contains:

blacklisting initcall foo_bar
blacklisting initcall sgi_uv_sysfs_init
...
...
initcall sgi_uv_sysfs_init blacklisted

[akpm@linux-foundation.org: tweak printk text]
Signed-off-by: Prarit Bhargava
Cc: Richard Weinberger
Cc: Andi Kleen
Cc: Josh Boyer
Cc: Rob Landley
Cc: Steven Rostedt
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Frederic Weisbecker
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Prarit Bhargava
2014-06-05 07:54:21 +0800
d62cf8152 init/main.c: don't use pr_debug()

Partially revert commit ea676e846a81 ("init/main.c: convert to
pr_foo()").

Unbeknownst to me, pr_debug() is different from the other pr_foo()
levels: pr_debug() is a no-op when DEBUG is not defined.

Happily, init/main.c does have a #define DEBUG so we didn't break
initcall_debug. But the functioning of initcall_debug should not be
dependent upon the presence of that #define DEBUG.

Reported-by: Russell King
Cc: Joe Perches
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Andrew Morton
2014-06-05 07:54:21 +0800
a8fe19ebf kernel/printk: use symbolic defines for console loglevels

... instead of naked numbers.

Stuff in sysrq.c used to set it to 8 which is supposed to mean above
default level so set it to DEBUG instead as we're terminating/killing all
tasks and we want to be verbose there.

Also, correct the check in x86_64_start_kernel which should be >= as
we're clearly issuing the string there for all debug levels, not only
the magical 10.
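
The difference between an equality check and a >= check can be sketched as follows. This is a Python illustration, not the kernel's C code; the enum mirrors the symbolic names, and the numeric values are given as I recall them from include/linux/printk.h, so treat them as assumptions.

```python
from enum import IntEnum

class ConsoleLoglevel(IntEnum):
    # Assumed values; check include/linux/printk.h before relying on them.
    SILENT = 0
    MIN = 1
    QUIET = 4
    DEFAULT = 7
    DEBUG = 10
    MOTORMOUTH = 15

def is_debug_old(console_loglevel):
    # Old-style check against the "magical 10" only.
    return console_loglevel == 10

def is_debug_new(console_loglevel):
    # Symbolic check that also covers every louder level.
    return console_loglevel >= ConsoleLoglevel.DEBUG
```

With these assumed values, a console loglevel of MOTORMOUTH passes the new check but would have been missed by the old one.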

Signed-off-by: Borislav Petkov
Acked-by: Kees Cook
Acked-by: Randy Dunlap
Cc: Joe Perches
Cc: Valdis Kletnieks
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Borislav Petkov
2014-06-05 07:54:17 +0800
f6187769d sys_sgetmask/sys_ssetmask: add CONFIG_SGETMASK_SYSCALL

sys_sgetmask and sys_ssetmask are obsolete system calls no longer
supported in libc.

This patch replaces the architecture-related __ARCH_WANT_SYS_SGETMASK by an
expert mode configuration option. That option is enabled by default for those
architectures.

Signed-off-by: Fabian Frederick
Cc: Steven Miao
Cc: Mikael Starvik
Cc: Jesper Nilsson
Cc: David Howells
Cc: Geert Uytterhoeven
Cc: Michal Simek
Cc: Ralf Baechle
Cc: Koichi Yasutake
Cc: "James E.J. Bottomley"
Cc: Helge Deller
Cc: Benjamin Herrenschmidt
Cc: Paul Mackerras
Cc: "David S. Miller"
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: "H. Peter Anvin"
Cc: Greg Ungerer
Cc: Heiko Carstens
Cc: Oleg Nesterov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Fabian Frederick
2014-06-05 07:54:14 +0800
226b4ccdc mm/process_vm_access: move config option into init/Kconfig

CONFIG_CROSS_MEMORY_ATTACH adds a couple of syscalls, process_vm_readv and
process_vm_writev, which are a kind of IPC for copying data between processes.
Currently this option is placed inside "Processor type and features".

This patch moves it into "General setup" (where all other arch-independent
syscalls and IPC features are placed) and changes the prompt string to
something less cryptic.

Signed-off-by: Konstantin Khlebnikov
Cc: Christopher Yeoh
Cc: Davidlohr Bueso
Cc: Hugh Dickins
Cc: Al Viro
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Konstantin Khlebnikov
2014-06-05 07:54:12 +0800
dc6f6c97f memcg: kill start_kernel()->mm_init_owner(&init_mm)

Remove start_kernel()->mm_init_owner(&init_mm, &init_task).

This doesn't really hurt, but it is unnecessary and misleading. init_task is the
"swapper" thread == current, its ->mm is always NULL. And init_mm can
only be used as ->active_mm, not as ->mm.

mm_init_owner() has a single caller with this patch, perhaps it should
die. mm_init() can initialize ->owner under #ifdef.

Signed-off-by: Oleg Nesterov
Reviewed-by: Michal Hocko
Cc: Balbir Singh
Cc: Johannes Weiner
Cc: KAMEZAWA Hiroyuki
Cc: Michal Hocko
Cc: Peter Chiang
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Oleg Nesterov
2014-06-05 07:54:03 +0800
f98bafa06 memcg: kill CONFIG_MM_OWNER

CONFIG_MM_OWNER makes no sense. It is not user-selectable, it is only
selected by CONFIG_MEMCG automatically. So we can kill this option in
init/Kconfig and do s/CONFIG_MM_OWNER/CONFIG_MEMCG/ globally.

Signed-off-by: Oleg Nesterov
Acked-by: Michal Hocko
Acked-by: Johannes Weiner
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Oleg Nesterov
2014-06-05 07:54:01 +0800
2ee064687 Documentation/memcg: warn about incomplete kmemcg state

Kmemcg is currently under development and lacks some important features.
In particular, it does not have support of kmem reclaim on memory pressure
inside cgroup, which practically makes it unusable in real life. Let's
warn about it in both Kconfig and Documentation to prevent complaints
arising.

Signed-off-by: Vladimir Davydov
Acked-by: Michal Hocko
Cc: Johannes Weiner
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Vladimir Davydov
2014-06-05 07:54:00 +0800

22 May, 2014

1 commit

e6ab9a20e Merge commit '7ed6fb9b5a5510e4ef78ab27419184741169978a' into x86/espfix

Merge in Linus' tree with:

fa81511bb0bb x86-64, modify_ldt: Make support for 16-bit segments a runtime option

... reverted, to avoid a conflict. This commit is no longer necessary
with the proper fix in place.

Signed-off-by: H. Peter Anvin

H. Peter Anvin
2014-05-22 06:23:19 +0800

06 May, 2014

1 commit

722a9f929 asmlinkage: Add explicit __visible to drivers/*, lib/*, kernel/*

As requested by Linus add explicit __visible to the asmlinkage users.
This marks functions visible to assembler.

Tree sweep for rest of tree.

Signed-off-by: Andi Kleen
Link: http://lkml.kernel.org/r/1398984278-29319-4-git-send-email-andi@firstfloor.org
Signed-off-by: H. Peter Anvin

Andi Kleen
2014-05-06 07:07:46 +0800

05 May, 2014

1 commit

197725de6 x86, espfix: Make espfix64 a Kconfig option, fix UML

Make espfix64 a hidden Kconfig option. This fixes the x86-64 UML
build which had broken due to the non-existence of init_espfix_bsp()
in UML: since UML uses its own Kconfig, this option does not appear in
the UML build.

This also makes it possible to make support for 16-bit segments a
configuration option, for the people who want to minimize the size of
the kernel.

Reported-by: Ingo Molnar
Signed-off-by: H. Peter Anvin
Cc: Richard Weinberger
Link: http://lkml.kernel.org/r/1398816946-3351-1-git-send-email-hpa@linux.intel.com

H. Peter Anvin
2014-05-05 01:00:49 +0800

01 May, 2014

1 commit

3891a04aa x86-64, espfix: Don't leak bits 31:16 of %esp returning to 16-bit stack

The IRET instruction, when returning to a 16-bit segment, only
restores the bottom 16 bits of the user space stack pointer. This
causes some 16-bit software to break, but it also leaks kernel state
to user space. We have a software workaround for that ("espfix") for
the 32-bit kernel, but it relies on a nonzero stack segment base which
is not available in 64-bit mode.

In checkin:

b3b42ac2cbae x86-64, modify_ldt: Ban 16-bit segments on 64-bit kernels

we "solved" this by forbidding 16-bit segments on 64-bit kernels, with
the logic that 16-bit support is crippled on 64-bit kernels anyway (no
V86 support), but it turns out that people are doing stuff like
running old Win16 binaries under Wine and expect it to work.

This works around this by creating percpu "ministacks", each of which
is mapped 2^16 times 64K apart. When we detect that the return SS is
on the LDT, we copy the IRET frame to the ministack and use the
relevant alias to return to userspace. The ministacks are mapped
readonly, so if IRET faults we promote #GP to #DF which is an IST
vector and thus has its own stack; we then do the fixup in the #DF
handler.
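
The address arithmetic behind the leak and the alias trick can be illustrated with a simplified model. The Python sketch below is not the kernel implementation: the function names and the example addresses are hypothetical, and it only models which stack pointer bits user space ends up seeing.

```python
MASK_HI = 0xFFFF0000  # bits 31:16 of the stack pointer
MASK_LO = 0x0000FFFF  # bits 15:0, the only part IRET restores

def esp_seen_by_user(rsp_at_iret, user_esp):
    """ESP visible to 16-bit user code after IRET.

    Simplified model: IRET restores only the low 16 bits, so bits 31:16
    are whatever the stack pointer held when IRET executed.
    """
    return (rsp_at_iret & MASK_HI) | (user_esp & MASK_LO)

def espfix_alias(ministack_addr, user_esp):
    """Pick the ministack alias whose bits 31:16 equal the user's.

    ministack_addr is a hypothetical address with bits 31:16 clear; the
    aliases are 64K apart, so bits 31:16 select one of 2^16 mappings.
    """
    return (ministack_addr & ~MASK_HI) | (user_esp & MASK_HI)
```

Returning from an ordinary kernel stack leaves kernel address bits in the upper half of ESP. Returning from the alias chosen by `espfix_alias` leaves the user's own upper bits there, so in this model nothing is leaked.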

(Making #GP an IST exception would make the msr_safe functions unsafe
in NMI/MC context, and quite possibly have other effects.)

Special thanks to:

- Andy Lutomirski, for the suggestion of using very small stack slots
and copy (as opposed to map) the IRET frame there, and for the
suggestion to mark them readonly and let the fault promote to #DF.
- Konrad Wilk for paravirt fixup and testing.
- Borislav Petkov for testing help and useful comments.

Reported-by: Brian Gerst
Signed-off-by: H. Peter Anvin
Link: http://lkml.kernel.org/r/1398816946-3351-1-git-send-email-hpa@linux.intel.com
Cc: Konrad Rzeszutek Wilk
Cc: Borislav Petkov
Cc: Andrew Lutomirski
Cc: Linus Torvalds
Cc: Dirk Hohndel
Cc: Arjan van de Ven
Cc: comex
Cc: Alexander van Heukelum
Cc: Boris Ostrovsky
Cc: # consider after upstream merge

H. Peter Anvin
2014-05-01 05:14:28 +0800

28 Apr, 2014

1 commit

51e158c12 param: hand arguments after -- straight to init

The kernel passes any args it doesn't need through to init, except it
assumes anything containing '.' belongs to the kernel (for a module).
This change means all users can clearly distinguish which arguments
are for init.

For example, the kernel uses debug ("dee-bug") to mean log everything to
the console, where systemd uses the debug from the Scandinavian "day-boog"
meaning "fail to boot". If a future version uses argv[] instead of
reading /proc/cmdline, this confusion will be avoided.

eg: test 'FOO="this is --foo"' -- 'systemd.debug="true true true"'

Gives:
argv[0] = '/debug-init'
argv[1] = 'test'
argv[2] = 'systemd.debug=true true true'
envp[0] = 'HOME=/'
envp[1] = 'TERM=linux'
envp[2] = 'FOO=this is --foo'
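
The split shown above can be approximated with a short model. The Python sketch below is an illustration under stated assumptions: `split_cmdline` and `kernel_params` are hypothetical names, and `shlex` quoting only approximates the kernel's own quote handling.

```python
import shlex

def split_cmdline(cmdline, kernel_params=frozenset()):
    """Return (args_for_init, env_for_init) for a kernel command line.

    kernel_params is a hypothetical set of names the kernel itself
    consumes. Before "--", names containing '.' are treated as module
    parameters and dropped; after "--", everything goes to init.
    """
    argv, envp = [], []
    after_dashes = False
    for tok in shlex.split(cmdline):
        if after_dashes:
            argv.append(tok)  # handed straight to init
            continue
        if tok == "--":
            after_dashes = True
            continue
        name, sep, _ = tok.partition("=")
        if name in kernel_params or "." in name:
            continue  # consumed by the kernel or by a module
        if sep:
            envp.append(tok)  # NAME=value becomes an environment entry
        else:
            argv.append(tok)
    return argv, envp
```

Fed the command line from the example (without the outer shell quoting), the model returns 'test' and 'systemd.debug=true true true' as arguments and 'FOO=this is --foo' as an environment entry, matching the argv[1], argv[2] and envp[2] lines above.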

Signed-off-by: Rusty Russell

Rusty Russell
2014-04-28 10:18:34 +0800

19 Apr, 2014

1 commit

82c04ff89 init/Kconfig: move the trusted keyring config option to general setup

The SYSTEM_TRUSTED_KEYRING config option is not in any menu, causing it
to show up in the toplevel of the kernel configuration. Fix this by
moving it under the General Setup menu.

Signed-off-by: Peter Foley
Cc: David Howells
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Peter Foley
2014-04-19 07:40:07 +0800

13 Apr, 2014

1 commit

0b747172d Merge git://git.infradead.org/users/eparis/audit

Pull audit updates from Eric Paris.

* git://git.infradead.org/users/eparis/audit: (28 commits)
AUDIT: make audit_is_compat depend on CONFIG_AUDIT_COMPAT_GENERIC
audit: renumber AUDIT_FEATURE_CHANGE into the 1300 range
audit: do not cast audit_rule_data pointers pointlessly
AUDIT: Allow login in non-init namespaces
audit: define audit_is_compat in kernel internal header
kernel: Use RCU_INIT_POINTER(x, NULL) in audit.c
sched: declare pid_alive as inline
audit: use uapi/linux/audit.h for AUDIT_ARCH declarations
syscall_get_arch: remove useless function arguments
audit: remove stray newline from audit_log_execve_info() audit_panic() call
audit: remove stray newlines from audit_log_lost messages
audit: include subject in login records
audit: remove superfluous new- prefix in AUDIT_LOGIN messages
audit: allow user processes to log from another PID namespace
audit: anchor all pid references in the initial pid namespace
audit: convert PPIDs to the initial PID namespace.
pid: get pid_t ppid of task in init_pid_ns
audit: rename the misleading audit_get_context() to audit_take_context()
audit: Add generic compat syscall support
audit: Add CONFIG_HAVE_ARCH_AUDITSYSCALL
...

Linus Torvalds
2014-04-13 03:38:53 +0800

08 Apr, 2014

2 commits

6aa7a29aa initramfs: debug detected compression method

This can greatly aid in narrowing down the real source of initramfs
problems such as failures related to the compression of the in-kernel
initramfs when an external initramfs is in use as well. Existing errors
are ambiguous as to which initramfs is a problem and why.

[akpm@linux-foundation.org: use pr_debug()]
Signed-off-by: Daniel M. Weeks
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Daniel M. Weeks
2014-04-08 07:36:11 +0800
5d2acfc7b kconfig: make allnoconfig disable options behind EMBEDDED and EXPERT

"make allnoconfig" exists to ease testing of minimal configurations.
Documentation/SubmitChecklist includes a note to test with allnoconfig.
This helps catch missing dependencies on common-but-not-required
functionality, which might otherwise go unnoticed.

However, allnoconfig still leaves many symbols enabled, because they're
hidden behind CONFIG_EMBEDDED or CONFIG_EXPERT. For instance, allnoconfig
still has CONFIG_PRINTK and CONFIG_BLOCK enabled, so drivers don't
typically get build-tested with those disabled.

To address this, introduce a new Kconfig option "allnoconfig_y", used on
symbols which only exist to hide other symbols. Set it on CONFIG_EMBEDDED
(which then selects CONFIG_EXPERT). allnoconfig will then disable all the
symbols hidden behind those.
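
The effect of the new option can be shown with a toy model of symbol visibility. The Python sketch below is an illustration only, not how the kconfig tool is implemented: the dictionary layout and the function name are hypothetical, and only one level of gating is modeled.

```python
def allnoconfig(symbols):
    """Toy model of "make allnoconfig".

    symbols maps a name to a dict with optional keys:
      "allnoconfig_y": True if the symbol carries the new option
      "selects":       names this symbol selects when enabled
      "visible_if":    name of the symbol that gates its prompt
      "default":       value used when the prompt is hidden
    """
    config = {}
    # Symbols marked allnoconfig_y are set to y, as is what they select.
    for name, sym in symbols.items():
        if sym.get("allnoconfig_y"):
            config[name] = "y"
            for selected in sym.get("selects", []):
                config[selected] = "y"
    # Ungated symbols have a visible prompt, so allnoconfig answers n.
    for name, sym in symbols.items():
        if name not in config and sym.get("visible_if") is None:
            config[name] = "n"
    # Gated symbols get n only if their prompt is visible; a hidden
    # prompt leaves the symbol at its default.
    for name, sym in symbols.items():
        if name not in config:
            visible = config.get(sym["visible_if"]) == "y"
            config[name] = "n" if visible else sym.get("default", "n")
    return config
```

Without the option, EXPERT ends up n, the PRINTK prompt stays hidden and PRINTK keeps its default of y. With allnoconfig_y on EMBEDDED, EXPERT becomes y, the prompt is visible and PRINTK is answered n.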

Signed-off-by: Josh Triplett
Tested-by: Paul E. McKenney
Cc: Michal Marek
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Josh Triplett
2014-04-08 07:36:09 +0800

04 Apr, 2014

5 commits

76ca7d1cc Merge branch 'akpm' (incoming from Andrew)

Merge first patch-bomb from Andrew Morton:
- Various misc bits
- kmemleak fixes
- small befs, codafs, cifs, efs, freexxfs, hfsplus, minixfs, reiserfs things
- fanotify
- I appear to have become SuperH maintainer
- ocfs2 updates
- direct-io tweaks
- a bit of the MM queue
- printk updates
- MAINTAINERS maintenance
- some backlight things
- lib/ updates
- checkpatch updates
- the rtc queue
- nilfs2 updates
- Small Documentation/ updates

* emailed patches from Andrew Morton : (237 commits)
Documentation/SubmittingPatches: remove references to patch-scripts
Documentation/SubmittingPatches: update some dead URLs
Documentation/filesystems/ntfs.txt: remove changelog reference
Documentation/kmemleak.txt: updates
fs/reiserfs/super.c: add __init to init_inodecache
fs/reiserfs: move prototype declaration to header file
fs/hfsplus/attributes.c: add __init to hfsplus_create_attr_tree_cache()
fs/hfsplus/extents.c: fix concurrent access of alloc_blocks
fs/hfsplus/extents.c: remove unused variable in hfsplus_get_block
nilfs2: update project's web site in nilfs2.txt
nilfs2: update MAINTAINERS file entries fix
nilfs2: verify metadata sizes read from disk
nilfs2: add FITRIM ioctl support for nilfs2
nilfs2: add nilfs_sufile_trim_fs to trim clean segs
nilfs2: implementation of NILFS_IOCTL_SET_SUINFO ioctl
nilfs2: add nilfs_sufile_set_suinfo to update segment usage
nilfs2: add struct nilfs_suinfo_update and flags
nilfs2: update MAINTAINERS file entries
fs/coda/inode.c: add __init to init_inodecache()
BEFS: logging cleanup
...

Linus Torvalds
2014-04-04 07:22:16 +0800
a68b31080 init/do_mounts.c: fix comment error

Signed-off-by: chishanmingshen
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

chishanmingshen
2014-04-04 07:21:16 +0800
69369a700 fs, kernel: permit disabling the uselib syscall

uselib hasn't been used since libc5; glibc does not use it. Support
turning it off.

When disabled, also omit the load_elf_library implementation from
binfmt_elf.c, which only uselib invokes.

bloat-o-meter:
add/remove: 0/4 grow/shrink: 0/1 up/down: 0/-785 (-785)
function old new delta
padzero 39 36 -3
uselib_flags 20 - -20
sys_uselib 168 - -168
SyS_uselib 168 - -168
load_elf_library 426 - -426

The new CONFIG_USELIB defaults to `y'.

Signed-off-by: Josh Triplett
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Josh Triplett
2014-04-04 07:21:05 +0800
6af9f7bf3 sys_sysfs: Add CONFIG_SYSFS_SYSCALL

sys_sysfs is an obsolete system call no longer supported by libc.

- This patch adds a default CONFIG_SYSFS_SYSCALL=y

- Option can be turned off in expert mode.

- cond_syscall added to kernel/sys_ni.c

[akpm@linux-foundation.org: tweak Kconfig help text]
Signed-off-by: Fabian Frederick
Cc: Randy Dunlap
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Fabian Frederick
2014-04-04 07:21:05 +0800
32d01dc7b Merge branch 'for-3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup

Pull cgroup updates from Tejun Heo:
"A lot of updates for cgroup:

- The biggest one is cgroup's conversion to kernfs. cgroup took
after the long abandoned vfs-entangled sysfs implementation and
made it even more convoluted over time. cgroup's internal objects
were fused with vfs objects which also brought in vfs locking and
object lifetime rules. Naturally, there are places where vfs rules
don't fit and nasty hacks, such as credential switching or lock
dance interleaving inode mutex and cgroup_mutex with object serial
number comparison thrown in to decide whether the operation is
actually necessary, needed to be employed.

After conversion to kernfs, internal object lifetime and locking
rules are mostly isolated from vfs interactions allowing shedding
of several nasty hacks and overall simplification. This will also
allow implementation of operations which may affect multiple cgroups
which weren't possible before as it would have required nesting
i_mutexes.

- Various simplifications including dropping of module support,
easier cgroup name/path handling, simplified cgroup file type
handling and task_cg_lists optimization.

- Preparatory changes for the planned unified hierarchy, which is still
a patchset away from being actually operational. The dummy
hierarchy is updated to serve as the default unified hierarchy.
Controllers which aren't claimed by other hierarchies are
associated with it, which BTW was what the dummy hierarchy was for
anyway.

- Various fixes from Li and others. This pull request includes some
patches to add missing slab.h to various subsystems. This was
triggered by the xattr.h include removal from cgroup.h. cgroup.h
indirectly got included by a lot of files, which brought in xattr.h,
which brought in slab.h.

There are several merge commits - one to pull in kernfs updates
necessary for converting cgroup (already in upstream through
driver-core), others for interfering changes in the fixes branch"

* 'for-3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (74 commits)
cgroup: remove useless argument from cgroup_exit()
cgroup: fix spurious lockdep warning in cgroup_exit()
cgroup: Use RCU_INIT_POINTER(x, NULL) in cgroup.c
cgroup: break kernfs active_ref protection in cgroup directory operations
cgroup: fix cgroup_taskset walking order
cgroup: implement CFTYPE_ONLY_ON_DFL
cgroup: make cgrp_dfl_root mountable
cgroup: drop const from @buffer of cftype->write_string()
cgroup: rename cgroup_dummy_root and related names
cgroup: move ->subsys_mask from cgroupfs_root to cgroup
cgroup: treat cgroup_dummy_root as an equivalent hierarchy during rebinding
cgroup: remove NULL checks from [pr_cont_]cgroup_{name|path}()
cgroup: use cgroup_setup_root() to initialize cgroup_dummy_root
cgroup: reorganize cgroup bootstrapping
cgroup: relocate setting of CGRP_DEAD
cpuset: use rcu_read_lock() to protect task_cs()
cgroup_freezer: document freezer_fork() subtleties
cgroup: update cgroup_transfer_tasks() to either succeed or fail
cgroup: drop task_lock() protection around task->cgroups
cgroup: update how a newly forked task gets associated with css_set
...

Linus Torvalds
2014-04-04 04:05:42 +0800

01 Apr, 2014

1 commit

462bf234a Merge branch 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull core locking updates from Ingo Molnar:
"The biggest change is the MCS spinlock generalization changes from Tim
Chen, Peter Zijlstra, Jason Low et al. There's also lockdep
fixes/enhancements from Oleg Nesterov, in particular a false negative
fix related to lockdep_set_novalidate_class() usage"

* 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (22 commits)
locking/mutex: Fix debug checks
locking/mutexes: Add extra reschedule point
locking/mutexes: Introduce cancelable MCS lock for adaptive spinning
locking/mutexes: Unlock the mutex without the wait_lock
locking/mutexes: Modify the way optimistic spinners are queued
locking/mutexes: Return false if task need_resched() in mutex_can_spin_on_owner()
locking: Move mcs_spinlock.h into kernel/locking/
m68k: Skip futex_atomic_cmpxchg_inatomic() test
futex: Allow architectures to skip futex_atomic_cmpxchg_inatomic() test
Revert "sched/wait: Suppress Sparse 'variable shadowing' warning"
lockdep: Change lockdep_set_novalidate_class() to use _and_name
lockdep: Change mark_held_locks() to check hlock->check instead of lockdep_no_validate
lockdep: Don't create the wrong dependency on hlock->check == 0
lockdep: Make held_lock->check and "int check" argument bool
locking/mcs: Allow architecture specific asm files to be used for contended case
locking/mcs: Order the header files in Kbuild of each architecture in alphabetical order
sched/wait: Suppress Sparse 'variable shadowing' warning
hung_task/Documentation: Fix hung_task_warnings description
locking/mcs: Allow architectures to hook in to contended paths
locking/mcs: Micro-optimize the MCS code, add extra comments
...

Linus Torvalds
2014-04-01 01:59:39 +0800

20 Mar, 2014

2 commits

7a0177212 audit: Add CONFIG_HAVE_ARCH_AUDITSYSCALL

Currently AUDITSYSCALL has a long list of architecture dependencies:
depends on AUDIT && (X86 || PARISC || PPC || S390 || IA64 || UML ||
SPARC64 || SUPERH || (ARM && AEABI && !OABI_COMPAT) || ALPHA)
The purpose of this patch is to replace it with HAVE_ARCH_AUDITSYSCALL
for simplicity.

Signed-off-by: AKASHI Takahiro
Acked-by: Will Deacon (arm)
Acked-by: Richard Guy Briggs (audit)
Acked-by: Matt Turner (alpha)
Acked-by: Michael Ellerman (powerpc)
Signed-off-by: Eric Paris

AKASHI Takahiro
2014-03-20 22:11:10 +0800
015d991f7 alpha: Enable system-call auditing support.

Signed-off-by: Zhenglong.cai
Signed-off-by: Matt Turner

蔡正龙
2014-03-20 22:11:09 +0800

13 Mar, 2014

1 commit

c4e1acbb3 ACPI / init: Invoke early ACPI initialization later

Commit 73f7d1ca3263 (ACPI / init: Run acpi_early_init() before
timekeeping_init()) optimistically moved the early ACPI initialization
before timekeeping_init(), but that didn't work, because it broke fast
TSC calibration for Julian Wollrath on Thinkpad x121e (and most likely
for others too). The reason is that acpi_early_init() enables the SCI
and that interferes with the fast TSC calibration mechanism.

Thus follow the original idea to execute acpi_early_init() before
efi_enter_virtual_mode() to help the EFI people for now and we can
revisit the other problem that commit 73f7d1ca3263 attempted to
address in the future (if really necessary).

Fixes: 73f7d1ca3263 (ACPI / init: Run acpi_early_init() before timekeeping_init())
Reported-by: Julian Wollrath
Reviewed-by: Thomas Gleixner
Signed-off-by: Rafael J. Wysocki

Rafael J. Wysocki
2014-03-13 07:53:51 +0800

03 Mar, 2014

1 commit

03b8c7b62 futex: Allow architectures to skip futex_atomic_cmpxchg_inatomic() test

If an architecture has futex_atomic_cmpxchg_inatomic() implemented and
no runtime check is necessary, allow the test within futex_init() to be skipped.

This gets rid of some code which would always give the same result,
and also allows the compiler to optimize a couple of if statements away.

Signed-off-by: Heiko Carstens
Cc: Finn Thain
Cc: Geert Uytterhoeven
Link: http://lkml.kernel.org/r/20140302120947.GA3641@osiris
Signed-off-by: Thomas Gleixner

Heiko Carstens
2014-03-03 18:32:08 +0800

12 Feb, 2014

1 commit

2bd59d48e cgroup: convert to kernfs

cgroup filesystem code was derived from the original sysfs
implementation which was heavily intertwined with vfs objects and
locking with the goal of re-using the existing vfs infrastructure.
That experiment turned out rather disastrous and sysfs switched, a
long time ago, to a distributed filesystem model where a separate
representation is maintained which is queried by vfs. Unfortunately,
cgroup stuck with the failed experiment all these years and
accumulated even more problems over time.

Locking and object lifetime management being entangled with vfs is
probably the most egregious. vfs was never designed to be misused like
this and cgroup ends up going through various convoluted dances to
make things work. Even then, operations across multiple cgroups can't
be done safely as it'll deadlock with rename locking.

Recently, kernfs was separated out from sysfs so that it can be used by
users other than sysfs. This patch converts cgroup to use kernfs,
which will bring the following benefits.

* Separation from vfs internals. Locking and object lifetime
management are contained in cgroup proper, making things a lot
simpler. This removes a significant amount of locking convolutions,
hairy object lifetime rules and the restriction on multi-cgroup
operations.

* Can drop a lot of the code implementing the filesystem interface, as
most of it is provided by kernfs.

* Proper "severing" semantics, which allows controllers to not worry
about lingering file accesses after offline.

While the preceding patches did as much as possible to make the
transition less painful, a large part of the conversion has to be one
discrete step, making this patch rather large. The rest of the commit
message lists notable changes in different areas.

Overall
-------

* vfs constructs replaced with kernfs ones. cgroup->dentry w/ ->kn,
cgroupfs_root->sb w/ ->kf_root.

* All dentry accessors are removed. Helpers to map from kernfs
constructs are added.

* All vfs plumbing around dentry, inode and bdi removed.

* cgroup_mount() now directly looks for matching root and then
proceeds to create a new one if not found.

Synchronization and object lifetime
-----------------------------------

* vfs inode locking removed. Among other things, this removes the
need for the convolution in cgroup_cfts_commit(). Future patches
will further simplify it.

* vfs refcnting replaced with cgroup internal ones. cgroup->refcnt,
cgroupfs_root->refcnt added. cgroup_put_root() now directly puts
root->refcnt and when it reaches zero proceeds to destroy it thus
merging cgroup_put_root() and the former cgroup_kill_sb().
Similarly, cgroup_put() now directly schedules cgroup_free_rcu()
when refcnt reaches zero.

* Unlike before, kernfs objects don't hold onto cgroup objects. When
cgroup destroys a kernfs node, all existing operations are drained
and the association is broken immediately. The same for
cgroupfs_roots and mounts.

* All operations which come through kernfs guarantee that the
associated cgroup is and stays valid for the duration of operation;
however, there are two paths which need to find out the associated
cgroup from dentry without going through kernfs -
css_tryget_from_dir() and cgroupstats_build(). For these two,
kernfs_node->priv is RCU managed so that they can dereference it
under RCU read lock.
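
The internal refcounting described above can be illustrated with a small
userspace sketch. It is not the kernel code: struct demo_root and the
demo_root_*() helpers are hypothetical names, and C11 atomics stand in
for the kernel's own primitives. It only shows the shape of a put path
that runs teardown itself once the last reference is dropped.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a refcounted object such as a cgroup root. */
struct demo_root {
    atomic_int refcnt;   /* internal reference count */
    bool destroyed;      /* set by the release path, for illustration only */
};

static void demo_root_init(struct demo_root *root)
{
    atomic_init(&root->refcnt, 1);   /* the creator holds the first reference */
    root->destroyed = false;
}

static void demo_root_get(struct demo_root *root)
{
    int old = atomic_fetch_add(&root->refcnt, 1);

    assert(old > 0);                 /* caller must already hold a reference */
}

/* Drop one reference; the final put performs the teardown directly. */
static void demo_root_put(struct demo_root *root)
{
    int old = atomic_fetch_sub(&root->refcnt, 1);

    assert(old > 0);
    if (old == 1)
        root->destroyed = true;      /* real code would release resources here */
}
```

Because the put path owns destruction, no separate "kill" callback is
needed once the count reaches zero.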

File and directory handling
---------------------------

* File and directory operations converted to kernfs_ops and
kernfs_syscall_ops.

* xattrs are implicitly supported by kernfs. No need to worry about them
from cgroup. This means that the "xattr" mount option is no longer
necessary. A future patch will add a deprecation warning message
when sane_behavior is in effect.

* When cftype->max_write_len > PAGE_SIZE, it's necessary to make a
private copy of one of the kernfs_ops to set its atomic_write_len.
cftype->kf_ops is added and cgroup_init/exit_cftypes() are updated
to handle it.

* cftype->lockdep_key added so that kernfs lockdep annotation can be
per cftype.

* Individual file entries and open states are now managed by kernfs.
No need to worry about them from cgroup. cfent, cgroup_open_file
and their friends are removed.

* kernfs_nodes are created deactivated, and kernfs_activate()
invocations are added to places where creation of new nodes is
committed.

* cgroup_rmdir() uses kernfs_[un]break_active_protection() for
self-removal.

v2: - Li pointed out in an earlier patch that specifying "name="
during mount without subsystem specification should succeed if
there's an existing hierarchy with a matching name although it
should fail with -EINVAL if a new hierarchy should be created.
Prior to the conversion, this used to be handled by deferring
failure from NULL return from cgroup_root_from_opts(), which was
necessary because root was being created before checking for
existing ones. Note that cgroup_root_from_opts() returned an
ERR_PTR() value for error conditions which require immediate
mount failure.

As we now have separate search and creation steps, deferring
failure from cgroup_root_from_opts() is no longer necessary.
cgroup_root_from_opts() is updated to always return ERR_PTR()
value on failure.

- The logic to match existing roots is updated so that a mount
attempt with a matching name but a different subsys_mask is
rejected. This was handled by a separate matching loop under
the comment "Check for name clashes with existing mounts" but
got lost during conversion. Merge the check into the main
search loop.

- Add __rcu __force casting in RCU_INIT_POINTER() in
cgroup_destroy_locked() to avoid the sparse address space
warning reported by kbuild test bot. Maybe we want an explicit
interface to use kn->priv as RCU protected pointer?
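
The search-then-create flow from the first two v2 items can be sketched
roughly as follows. This is a simplified userspace illustration, not the
code in cgroup_mount(): struct demo_root, struct demo_opts and
demo_find_root() are hypothetical, the "none" option is left out, and
returning -EBUSY for a name match with a different subsys_mask is an
assumption (the message above only says such a mount is rejected).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced views of an existing root and of mount options. */
struct demo_root { const char *name; unsigned long subsys_mask; };
struct demo_opts { const char *name; unsigned long subsys_mask; };

/*
 * Returns 0 with *found set when an existing root should be reused,
 * 0 with *found == NULL when a new root should be created, or a
 * negative errno when the mount attempt must fail.
 */
static int demo_find_root(const struct demo_root *roots, size_t nr,
                          const struct demo_opts *opts,
                          const struct demo_root **found)
{
    assert(opts != NULL && found != NULL);
    *found = NULL;

    for (size_t i = 0; i < nr; i++) {
        const struct demo_root *root = &roots[i];
        int name_match = 0;

        /* If a name was requested, it must match. */
        if (opts->name) {
            if (!root->name || strcmp(opts->name, root->name) != 0)
                continue;
            name_match = 1;
        }

        /* If subsystems were requested, they must match as well. */
        if (opts->subsys_mask && opts->subsys_mask != root->subsys_mask) {
            if (!name_match)
                continue;
            return -EBUSY;   /* same name, different subsystems (assumed errno) */
        }

        *found = root;       /* reuse the existing hierarchy */
        return 0;
    }

    /* Nothing matched: a new hierarchy needs a subsystem specification. */
    if (!opts->subsys_mask)
        return -EINVAL;
    return 0;
}
```

With this split, a bare "name=" mount succeeds only when the loop finds
an existing hierarchy, and the failure for the creation case is decided
after the search rather than deferred from option parsing.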

v3: Make CONFIG_CGROUPS select CONFIG_KERNFS.

v4: Rebased on top of 0ab02ca8f887 ("cgroup: protect modifications to
cgroup_idr with cgroup_mutex").

Signed-off-by: Tejun Heo
Acked-by: Li Zefan
Cc: kbuild test robot <fengguang.wu@intel.com>

Tejun Heo
2014-02-12 00:52:49 +0800

06 Feb, 2014

1 commit

c4ad8f98b execve: use 'struct filename *' for executable name passing

This changes 'do_execve()' to get the executable name as a 'struct
filename', and to free it when it is done. This is what the normal
users want, and it simplifies and streamlines their error handling.

The controlled lifetime of the executable name also fixes a
use-after-free problem with the trace_sched_process_exec tracepoint: the
lifetime of the passed-in string for kernel users was not at all
obvious, and the user-mode helper code used UMH_WAIT_EXEC to serialize
the pathname allocation lifetime with the execve() having finished,
which in turn meant that the trace point that happened after
mm_release() of the old process VM ended up using already free'd memory.

To solve the kernel string lifetime issue, this simply introduces
"getname_kernel()" that works like the normal user-space getname()
function, except with the source coming from kernel memory.
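
As a rough userspace analogue of that idea (not the kernel
implementation), the sketch below copies a caller-supplied string into
an object with its own lifetime. struct demo_filename,
demo_getname_kernel() and demo_putname() are hypothetical names, and
malloc()/free() stand in for the kernel allocators.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace analogue of an owned pathname object. */
struct demo_filename {
    char *name;   /* private copy, valid until demo_putname() */
};

/* Copy src into a newly allocated object; returns NULL on allocation failure. */
static struct demo_filename *demo_getname_kernel(const char *src)
{
    struct demo_filename *fn;
    size_t len;

    assert(src != NULL);

    fn = malloc(sizeof(*fn));
    if (!fn)
        return NULL;

    len = strlen(src) + 1;           /* include the terminating NUL */
    fn->name = malloc(len);
    if (!fn->name) {
        free(fn);
        return NULL;
    }
    memcpy(fn->name, src, len);
    return fn;
}

/* Release the object and its private copy of the name. */
static void demo_putname(struct demo_filename *fn)
{
    if (!fn)
        return;
    free(fn->name);
    free(fn);
}
```

Since the callee holds its own copy, the caller's buffer can be changed
or freed without affecting later readers of the name.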

As Oleg points out, this also means that we could drop the tcomm[] array
from 'struct linux_binprm', since the pathname lifetime now covers
setup_new_exec(). That would be a separate cleanup.

Reported-by: Igor Zhbanov
Tested-by: Steven Rostedt
Cc: Oleg Nesterov
Cc: Al Viro
Signed-off-by: Linus Torvalds

Linus Torvalds
2014-02-06 04:54:53 +0800

01 Feb, 2014

1 commit

a9302e843 alpha: Enable system-call auditing support.

Signed-off-by: Zhenglong.cai
Signed-off-by: Matt Turner

蔡正龙
2014-02-01 01:21:55 +0800

29 Jan, 2014

1 commit

bf3d846b7 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull vfs updates from Al Viro:
"Assorted stuff; the biggest pile here is Christoph's ACL series. Plus
assorted cleanups and fixes all over the place...

There will be another pile later this week"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (43 commits)
__dentry_path() fixes
vfs: Remove second variable named error in __dentry_path
vfs: Is mounted should be testing mnt_ns for NULL or error.
Fix race when checking i_size on direct i/o read
hfsplus: remove can_set_xattr
nfsd: use get_acl and ->set_acl
fs: remove generic_acl
nfs: use generic posix ACL infrastructure for v3 Posix ACLs
gfs2: use generic posix ACL infrastructure
jfs: use generic posix ACL infrastructure
xfs: use generic posix ACL infrastructure
reiserfs: use generic posix ACL infrastructure
ocfs2: use generic posix ACL infrastructure
jffs2: use generic posix ACL infrastructure
hfsplus: use generic posix ACL infrastructure
f2fs: use generic posix ACL infrastructure
ext2/3/4: use generic posix ACL infrastructure
btrfs: use generic posix ACL infrastructure
fs: make posix_acl_create more useful
fs: make posix_acl_chmod more useful
...

Linus Torvalds
2014-01-29 00:38:04 +0800

28 Jan, 2014

1 commit

729abd2ba init/main.c: remove unused declarations of mca_init() and sbus_init()

mca_init() no longer exists.
sbus_init() is defined in arch/sparc/kernel/sbus.c and is a subsys_initcall.
Neither is needed in main.c any more.

Signed-off-by: Kang Hu
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Kang Hu
2014-01-28 13:02:39 +0800