Eric Lee / smarc-fsl-linux-kernel

15 Aug, 2014

1 commit

a2a368d90 mm: fix CROSS_MEMORY_ATTACH help text grammar ... Browse Code »

Signed-off-by: Geert Uytterhoeven
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Geert Uytterhoeven
2014-08-15 00:56:15 +0800

09 Aug, 2014

8 commits

8065be8d0 Merge branch 'akpm' (second patchbomb from Andrew Morton) ... Browse Code »

Merge more incoming from Andrew Morton:
"Two new syscalls:

memfd_create in "shm: add memfd_create() syscall"
kexec_file_load in "kexec: implementation of new syscall kexec_file_load"

And:

- Most (all?) of the rest of MM

- Lots of the usual misc bits

- fs/autofs4

- drivers/rtc

- fs/nilfs

- procfs

- fork.c, exec.c

- more in lib/

- rapidio

- Janitorial work in filesystems: fs/ufs, fs/reiserfs, fs/adfs,
fs/cramfs, fs/romfs, fs/qnx6.

- initrd/initramfs work

- "file sealing" and the memfd_create() syscall, in tmpfs

- add pci_zalloc_consistent, use it in lots of places

- MAINTAINERS maintenance

- kexec feature work"

* emailed patches from Andrew Morton <akpm@linux-foundation.org: (193 commits)
MAINTAINERS: update nomadik patterns
MAINTAINERS: update usb/gadget patterns
MAINTAINERS: update DMA BUFFER SHARING patterns
kexec: verify the signature of signed PE bzImage
kexec: support kexec/kdump on EFI systems
kexec: support for kexec on panic using new system call
kexec-bzImage64: support for loading bzImage using 64bit entry
kexec: load and relocate purgatory at kernel load time
purgatory: core purgatory functionality
purgatory/sha256: provide implementation of sha256 in purgaotory context
kexec: implementation of new syscall kexec_file_load
kexec: new syscall kexec_file_load() declaration
kexec: make kexec_segment user buffer pointer a union
resource: provide new functions to walk through resources
kexec: use common function for kimage_normal_alloc() and kimage_crash_alloc()
kexec: move segment verification code in a separate function
kexec: rename unusebale_pages to unusable_pages
kernel: build bin2c based on config option CONFIG_BUILD_BIN2C
bin2c: move bin2c in scripts/basic
shm: wait for pins to be released when sealing
...

Linus Torvalds
2014-08-09 06:57:47 +0800
de5b56ba5 kernel: build bin2c based on config option CONFIG_BUILD_BIN2C ... Browse Code »

currently bin2c builds only if CONFIG_IKCONFIG=y. But bin2c will now be
used by kexec too. So make it compilation dependent on CONFIG_BUILD_BIN2C
and this config option can be selected by CONFIG_KEXEC and CONFIG_IKCONFIG.

Signed-off-by: Vivek Goyal
Cc: Borislav Petkov
Cc: Michael Kerrisk
Cc: Yinghai Lu
Cc: Eric Biederman
Cc: H. Peter Anvin
Cc: Matthew Garrett
Cc: Greg Kroah-Hartman
Cc: Dave Young
Cc: WANG Chao
Cc: Baoquan He
Cc: Andy Lutomirski
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Vivek Goyal
2014-08-09 06:57:32 +0800
dd4d9fecb init/main.c: code clean-up ... Browse Code »

Fixing some checkpatch warnings(remove global initialization, move
__initdata, coalesce formats ...)

Signed-off-by: Fabian Frederick
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Fabian Frederick
2014-08-09 06:57:27 +0800
9687fd910 initramfs: add write error checks ... Browse Code »

On a system with low memory extracting the initramfs may fail. If this
happens the user gets "Failed to execute /init" instead of an initramfs
error.

Check return value of sys_write and call error() when the write was
incomplete or failed.

Signed-off-by: David Engraf
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

David Engraf
2014-08-09 06:57:26 +0800
d97b07c54 initramfs: support initramfs that is bigger than 2GiB ... Browse Code »

Now with 64bit bzImage and kexec tools, we support ramdisk that size is
bigger than 2g, as we could put it above 4G.

Found compressed initramfs image could not be decompressed properly. It
turns out that image length is int during decompress detection, and it
will become < 0 when length is more than 2G. Furthermore, during
decompressing len as int is used for inbuf count, that has problem too.

Change len to long, that should be ok as on 32 bit platform long is
32bits.

Tested with following compressed initramfs image as root with kexec.
gzip, bzip2, xz, lzma, lzop, lz4.
run time for populate_rootfs():
size name Nehalem-EX Westmere-EX Ivybridge-EX
9034400256 root_img : 26s 24s 30s
3561095057 root_img.lz4 : 28s 27s 27s
3459554629 root_img.lzo : 29s 29s 28s
3219399480 root_img.gz : 64s 62s 49s
2251594592 root_img.xz : 262s 260s 183s
2226366598 root_img.lzma: 386s 376s 277s
2901482513 root_img.bz2 : 635s 599s

Signed-off-by: Yinghai Lu
Cc: "H. Peter Anvin"
Cc: Ingo Molnar
Cc: Rashika Kheria
Cc: Josh Triplett
Cc: Kyungsik Lee
Cc: P J P
Cc: Al Viro
Cc: Tetsuo Handa
Cc: "Daniel M. Weeks"
Cc: Alexandre Courbot
Cc: Jan Beulich
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Yinghai Lu
2014-08-09 06:57:26 +0800
387474399 initramfs: support initrd that is bigger than 2GiB ... Browse Code »

When initrd (compressed or not) is used, kernel report data corrupted with
/dev/ram0.

The root cause:
During initramfs checking, if it is initrd, it will be transferred to
/initrd.image with sys_write.
sys_write only support 2G-4K write, so if the initrd ram is more than
that, /initrd.image will not complete at all.

Add local xwrite to loop calling sys_write to workaround the problem.

Also need to use xwrite in write_buffer() to handle:
image is uncompressed cpio and there is one big file (>2G) in it.
unpack_to_rootfs ===> write_buffer ===> actions[]/do_copy

At the same time, we don't need to worry about sys_read/sys_write in
do_mounts_rd.c::crd_load. As decompressor will have fill/flush and local
buffer that is smaller than 2G.

Test with uncompressed initrd, and compressed ones with gz, bz2, lzma,xz,
lzop.

Signed-off-by: Yinghai Lu
Acked-by: H. Peter Anvin
Cc: Ingo Molnar
Cc: Geert Uytterhoeven
Cc: Tetsuo Handa
Cc: "Daniel M. Weeks"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Yinghai Lu
2014-08-09 06:57:26 +0800
4dfe694f6 init: make rootdelay=N consistent with rootwait behaviour ... Browse Code »

Currently rootdelay=N and rootwait behave differently (aside from the
obvious unbounded wait duration) because they are at different places in
the init sequence.

The difference manifests itself for md devices because the call to
md_run_setup() lives between rootdelay and rootwait, so if you try to use
rootdelay=20 to try and allow a slow RAID0 array to assemble, you get
this:

[ 4.526011] sd 6:0:0:0: [sdc] Attached SCSI removable disk
[ 22.972079] md: Waiting for all devices to be available before autodetect

i.e. you've achieved nothing other than delaying the probing 20s, when
what you wanted was a 20s delay _after_ the probing for md devices was
initiated.

Here we move the rootdelay code to be right beside the rootwait code, so
that their behaviour is consistent.

It should be noted that in doing so, the actions based on the
saved_root_name[0] and initrd_load() were previously put on hold by
rootdelay=N and now currently will not be delayed. However, I think
consistent behaviour is more important than matching historical behaviour
of delaying the above two operations.

Signed-off-by: Paul Gortmaker
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Paul Gortmaker
2014-08-09 06:57:18 +0800
b3345d7c5 Merge tag 'soc-for-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc ... Browse Code »

Pull ARM SoC platform changes from Olof Johansson:
"This is the bulk of new SoC enablement and other platform changes for
3.17:

- Samsung S5PV210 has been converted to DT and multiplatform
- Clock drivers and bindings for some of the lower-end i.MX 1/2
platforms
- Kirkwood, one of the popular Marvell platforms, is folded into the
mvebu platform code, removing mach-kirkwood
- Hwmod data for TI AM43xx and DRA7 platforms
- More additions of Renesas shmobile platform support
- Removal of plat-samsung contents that can be removed with S5PV210
being multiplatform/DT-enabled and the other two old platforms
being removed

New platforms (most with only basic support right now):

- Hisilicon X5HD2 settop box chipset is introduced
- Mediatek MT6589 (mobile chipset) is introduced
- Broadcom BCM7xxx settop box chipset is introduced

+ as usual a lot other pieces all over the platform code"

* tag 'soc-for-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (240 commits)
ARM: hisi: remove smp from machine descriptor
power: reset: move hisilicon reboot code
ARM: dts: Add hix5hd2-dkb dts file.
ARM: debug: Rename Hi3716 to HIX5HD2
ARM: hisi: enable hix5hd2 SoC
ARM: hisi: add ARCH_HISI
MAINTAINERS: add entry for Broadcom ARM STB architecture
ARM: brcmstb: select GISB arbiter and interrupt drivers
ARM: brcmstb: add infrastructure for ARM-based Broadcom STB SoCs
ARM: configs: enable SMP in bcm_defconfig
ARM: add SMP support for Broadcom mobile SoCs
Documentation: arm: misc updates to Marvell EBU SoC status
Documentation: arm: add URLs to public datasheets for the Marvell Armada XP SoC
ARM: mvebu: fix build without platforms selected
ARM: mvebu: add cpuidle support for Armada 38x
ARM: mvebu: add cpuidle support for Armada 370
cpuidle: mvebu: add Armada 38x support
cpuidle: mvebu: add Armada 370 support
cpuidle: mvebu: rename the driver from armada-370-xp to mvebu-v7
ARM: mvebu: export the SCU address
...

Linus Torvalds
2014-08-09 02:14:29 +0800

07 Aug, 2014

1 commit

23b2899f7 printk: allow increasing the ring buffer depending on the number of CPUs ... Browse Code »

The default size of the ring buffer is too small for machines with a
large amount of CPUs under heavy load. What ends up happening when
debugging is the ring buffer overlaps and chews up old messages making
debugging impossible unless the size is passed as a kernel parameter.
An idle system upon boot up will on average spew out only about one or
two extra lines but where this really matters is on heavy load and that
will vary widely depending on the system and environment.

There are mechanisms to help increase the kernel ring buffer for tracing
through debugfs, and those interfaces even allow growing the kernel ring
buffer per CPU. We also have a static value which can be passed upon
boot. Relying on debugfs however is not ideal for production, and
relying on the value passed upon bootup is can only used *after* an
issue has creeped up. Instead of being reactive this adds a proactive
measure which lets you scale the amount of contributions you'd expect to
the kernel ring buffer under load by each CPU in the worst case
scenario.

We use num_possible_cpus() to avoid complexities which could be
introduced by dynamically changing the ring buffer size at run time,
num_possible_cpus() lets us use the upper limit on possible number of
CPUs therefore avoiding having to deal with hotplugging CPUs on and off.
This introduces the kernel configuration option LOG_CPU_MAX_BUF_SHIFT
which is used to specify the maximum amount of contributions to the
kernel ring buffer in the worst case before the kernel ring buffer flips
over, the size is specified as a power of 2. The total amount of
contributions made by each CPU must be greater than half of the default
kernel ring buffer size (1 << LOG_BUF_SHIFT bytes) in order to trigger
an increase upon bootup. The kernel ring buffer is increased to the
next power of two that would fit the required minimum kernel ring buffer
size plus the additional CPU contribution. For example if LOG_BUF_SHIFT
is 18 (256 KB) you'd require at least 128 KB contributions by other CPUs
in order to trigger an increase of the kernel ring buffer. With a
LOG_CPU_BUF_SHIFT of 12 (4 KB) you'd require at least anything over > 64
possible CPUs to trigger an increase. If you had 128 possible CPUs the
amount of minimum required kernel ring buffer bumps to:

((1 << 18) + ((128 - 1) * (1 << 12))) / 1024 = 764 KB

Since we require the ring buffer to be a power of two the new required
size would be 1024 KB.

This CPU contributions are ignored when the "log_buf_len" kernel
parameter is used as it forces the exact size of the ring buffer to an
expected power of two value.

[pmladek@suse.cz: fix build]
Signed-off-by: Luis R. Rodriguez
Signed-off-by: Petr Mladek
Tested-by: Davidlohr Bueso
Tested-by: Petr Mladek
Reviewed-by: Davidlohr Bueso
Cc: Andrew Lunn
Cc: Stephen Warren
Cc: Michal Hocko
Cc: Petr Mladek
Cc: Joe Perches
Cc: Arun KS
Cc: Kees Cook
Cc: Davidlohr Bueso
Cc: Chris Metcalf
Cc: Jan Kara
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Luis R. Rodriguez
2014-08-07 09:01:23 +0800

10 Jul, 2014

2 commits

1823172ab Merge branches 'doc.2014.07.08a', 'fixes.2014.07.09a', 'maintainers.2014.07.08b'… ... Browse Code »

…, 'nocbs.2014.07.07a' and 'torture.2014.07.07a' into HEAD

doc.2014.07.08a: Documentation updates.
fixes.2014.07.09a: Miscellaneous fixes.
maintainers.2014.07.08b: Maintainership updates.
nocbs.2014.07.07a: Callback-offloading fixes.
torture.2014.07.07a: Torture-test updates.

Paul E. McKenney
2014-07-10 00:16:54 +0800
ab74fdfd4 rcu: Handle obsolete references to TINY_PREEMPT_RCU ... Browse Code »

Signed-off-by: Paul E. McKenney
Reviewed-by: Lai Jiangshan

Paul E. McKenney
2014-07-10 00:14:17 +0800

08 Jul, 2014

1 commit

b58cc46c5 rcu: Don't offload callbacks unless specifically requested ... Browse Code »

Enabling NO_HZ_FULL currently has the side effect of enabling callback
offloading on all CPUs. This results in lots of additional rcuo kthreads,
and can also increase context switching and wakeups, even in cases where
callback offloading is neither needed nor particularly desirable. This
commit therefore enables callback offloading on a given CPU only if
specifically requested at build time or boot time, or if that CPU has
been specifically designated (again, either at build time or boot time)
as a nohz_full CPU.

Signed-off-by: Paul E. McKenney

Paul E. McKenney
2014-07-08 06:13:44 +0800

17 Jun, 2014

1 commit

e6639117d kernel: add calibration_delay_done() ... Browse Code »

Add calibration_delay_done() call and dummy implementation. This allows
architectures to stop accepting registrations for new timer based delay
functions.

Signed-off-by: Peter De Schrijver
Acked-by: Russell King
Signed-off-by: Stephen Warren

Peter De Schrijver
2014-06-17 02:47:39 +0800

12 Jun, 2014

1 commit

4251c2a67 Merge tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux ... Browse Code »

Pull module updates from Rusty Russell:
"Most of this is cleaning up various driver sysfs permissions so we can
re-add the perm check (we unified the module param and sysfs checks,
but the module ones were stronger so we weakened them temporarily).

Param parsing gets documented, and also "--" now forces args to be
handed to init (and ignored by the kernel).

Module NX/RO protections get tightened: we now set them before calling
parse_args()"

* tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
module: set nx before marking module MODULE_STATE_COMING.
samples/kobject/: avoid world-writable sysfs files.
drivers/hid/hid-picolcd_fb: avoid world-writable sysfs files.
drivers/staging/speakup/: avoid world-writable sysfs files.
drivers/regulator/virtual: avoid world-writable sysfs files.
drivers/scsi/pm8001/pm8001_ctl.c: avoid world-writable sysfs files.
drivers/hid/hid-lg4ff.c: avoid world-writable sysfs files.
drivers/video/fbdev/sm501fb.c: avoid world-writable sysfs files.
drivers/mtd/devices/docg3.c: avoid world-writable sysfs files.
speakup: fix incorrect perms on speakup_acntsa.c
cpumask.h: silence warning with -Wsign-compare
Documentation: Update kernel-parameters.tx
param: hand arguments after -- straight to init
modpost: Fix resource leak in read_dump()

Linus Torvalds
2014-06-12 07:09:14 +0800

05 Jun, 2014

11 commits

2071b3e34 Merge branch 'x86/espfix' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into next ... Browse Code »

Pull x86-64 espfix changes from Peter Anvin:
"This is the espfix64 code, which fixes the IRET information leak as
well as the associated functionality problem. With this code applied,
16-bit stack segments finally work as intended even on a 64-bit
kernel.

Consequently, this patchset also removes the runtime option that we
added as an interim measure.

To help the people working on Linux kernels for very small systems,
this patchset also makes these compile-time configurable features"

* 'x86/espfix' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
Revert "x86-64, modify_ldt: Make support for 16-bit segments a runtime option"
x86, espfix: Make it possible to disable 16-bit support
x86, espfix: Make espfix64 a Kconfig option, fix UML
x86, espfix: Fix broken header guard
x86, espfix: Move espfix definitions into a separate header file
x86-32, espfix: Remove filter for espfix32 due to race
x86-64, espfix: Don't leak bits 31:16 of %esp returning to 16-bit stack

Linus Torvalds
2014-06-05 22:46:15 +0800
647f010bf init/main.c: remove an ifdef ... Browse Code »

Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Andrew Morton
2014-06-05 07:54:21 +0800
34a1b7236 kthreads: kill CLONE_KERNEL, change kernel_thread(kernel_init) to avoid CLONE_SIGHAND ... Browse Code »

1. Remove CLONE_KERNEL, it has no users and it is dangerous.

The (old) comment says "List of flags we want to share for kernel
threads" but this is not true, we do not want to share ->sighand by
default. This flag can only be used if the caller is sure that both
parent/child will never play with signals (say, allow_signal/etc).

2. Change rest_init() to clone kernel_init() without CLONE_SIGHAND.

In this case CLONE_SIGHAND does not really hurt, and it looks like
optimization because copy_sighand() can avoid kmem_cache_alloc().

But in fact this only adds the minor pessimization. kernel_init()
is going to exec the init process, and de_thread() will need to
unshare ->sighand and do kmem_cache_alloc(sighand_cachep) anyway,
but it needs to do more work and take tasklist_lock and siglock.

Signed-off-by: Oleg Nesterov
Acked-by: Peter Zijlstra
Acked-by: Steven Rostedt
Cc: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Mathieu Desnoyers
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Oleg Nesterov
2014-06-05 07:54:21 +0800
7b0b73d76 init/main.c: add initcall_blacklist kernel parameter ... Browse Code »

When a module is built into the kernel the module_init() function
becomes an initcall. Sometimes debugging through dynamic debug can
help, however, debugging built in kernel modules is typically done by
changing the .config, recompiling, and booting the new kernel in an
effort to determine exactly which module caused a problem.

This patchset can be useful stand-alone or combined with initcall_debug.
There are cases where some initcalls can hang the machine before the
console can be flushed, which can make initcall_debug output inaccurate.
Having the ability to skip initcalls can help further debugging of these
scenarios.

Usage: initcall_blacklist=

ex) added "initcall_blacklist=sgi_uv_sysfs_init" as a kernel parameter and
the log contains:

blacklisting initcall sgi_uv_sysfs_init
...
...
initcall sgi_uv_sysfs_init blacklisted

ex) added "initcall_blacklist=foo_bar,sgi_uv_sysfs_init" as a kernel parameter
and the log contains:

blacklisting initcall foo_bar
blacklisting initcall sgi_uv_sysfs_init
...
...
initcall sgi_uv_sysfs_init blacklisted

[akpm@linux-foundation.org: tweak printk text]
Signed-off-by: Prarit Bhargava
Cc: Richard Weinberger
Cc: Andi Kleen
Cc: Josh Boyer
Cc: Rob Landley
Cc: Steven Rostedt
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Frederic Weisbecker
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Prarit Bhargava
2014-06-05 07:54:21 +0800
d62cf8152 init/main.c: don't use pr_debug() ... Browse Code »

Pertially revert commit ea676e846a81 ("init/main.c: convert to
pr_foo()").

Unbeknownst to me, pr_debug() is different from the other pr_foo()
levels: pr_debug() is a no-op when DEBUG is not defined.

Happily, init/main.c does have a #define DEBUG so we didn't break
initcall_debug. But the functioning of initcall_debug should not be
dependent upon the presence of that #define DEBUG.

Reported-by: Russell King
Cc: Joe Perches
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Andrew Morton
2014-06-05 07:54:21 +0800
a8fe19ebf kernel/printk: use symbolic defines for console loglevels ... Browse Code »

... instead of naked numbers.

Stuff in sysrq.c used to set it to 8 which is supposed to mean above
default level so set it to DEBUG instead as we're terminating/killing all
tasks and we want to be verbose there.

Also, correct the check in x86_64_start_kernel which should be >= as
we're clearly issuing the string there for all debug levels, not only
the magical 10.

Signed-off-by: Borislav Petkov
Acked-by: Kees Cook
Acked-by: Randy Dunlap
Cc: Joe Perches
Cc: Valdis Kletnieks
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Borislav Petkov
2014-06-05 07:54:17 +0800
f6187769d sys_sgetmask/sys_ssetmask: add CONFIG_SGETMASK_SYSCALL ... Browse Code »

sys_sgetmask and sys_ssetmask are obsolete system calls no longer
supported in libc.

This patch replaces architecture related __ARCH_WANT_SYS_SGETMAX by expert
mode configuration.That option is enabled by default for those
architectures.

Signed-off-by: Fabian Frederick
Cc: Steven Miao
Cc: Mikael Starvik
Cc: Jesper Nilsson
Cc: David Howells
Cc: Geert Uytterhoeven
Cc: Michal Simek
Cc: Ralf Baechle
Cc: Koichi Yasutake
Cc: "James E.J. Bottomley"
Cc: Helge Deller
Cc: Benjamin Herrenschmidt
Cc: Paul Mackerras
Cc: "David S. Miller"
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: "H. Peter Anvin"
Cc: Greg Ungerer
Cc: Heiko Carstens
Cc: Oleg Nesterov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Fabian Frederick
2014-06-05 07:54:14 +0800
226b4ccdc mm/process_vm_access: move config option into init/Kconfig ... Browse Code »

CONFIG_CROSS_MEMORY_ATTACH adds couple syscalls: process_vm_readv and
process_vm_writev, it's a kind of IPC for copying data between processes.
Currently this option is placed inside "Processor type and features".

This patch moves it into "General setup" (where all other arch-independed
syscalls and ipc features are placed) and changes prompt string to less
cryptic.

Signed-off-by: Konstantin Khlebnikov
Cc: Christopher Yeoh
Cc: Davidlohr Bueso
Cc: Hugh Dickins
Cc: Al Viro
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Konstantin Khlebnikov
2014-06-05 07:54:12 +0800
dc6f6c97f memcg: kill start_kernel()->mm_init_owner(&init_mm) ... Browse Code »

Remove start_kernel()->mm_init_owner(&init_mm, &init_task).

This doesn't really hurt but unnecessary and misleading. init_task is the
"swapper" thread == current, its ->mm is always NULL. And init_mm can
only be used as ->active_mm, not as ->mm.

mm_init_owner() has a single caller with this patch, perhaps it should
die. mm_init() can initialize ->owner under #ifdef.

Signed-off-by: Oleg Nesterov
Reviewed-by: Michal Hocko
Cc: Balbir Singh
Cc: Johannes Weiner
Cc: KAMEZAWA Hiroyuki
Cc: Michal Hocko
Cc: Peter Chiang
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Oleg Nesterov
2014-06-05 07:54:03 +0800
f98bafa06 memcg: kill CONFIG_MM_OWNER ... Browse Code »

CONFIG_MM_OWNER makes no sense. It is not user-selectable, it is only
selected by CONFIG_MEMCG automatically. So we can kill this option in
init/Kconfig and do s/CONFIG_MM_OWNER/CONFIG_MEMCG/ globally.

Signed-off-by: Oleg Nesterov
Acked-by: Michal Hocko
Acked-by: Johannes Weiner
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Oleg Nesterov
2014-06-05 07:54:01 +0800
2ee064687 Documentation/memcg: warn about incomplete kmemcg state ... Browse Code »

Kmemcg is currently under development and lacks some important features.
In particular, it does not have support of kmem reclaim on memory pressure
inside cgroup, which practically makes it unusable in real life. Let's
warn about it in both Kconfig and Documentation to prevent complaints
arising.

Signed-off-by: Vladimir Davydov
Acked-by: Michal Hocko
Cc: Johannes Weiner
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Vladimir Davydov
2014-06-05 07:54:00 +0800

22 May, 2014

1 commit

e6ab9a20e Merge commit '7ed6fb9b5a5510e4ef78ab27419184741169978a ' into x86/espfix ... Browse Code »

Merge in Linus' tree with:

fa81511bb0bb x86-64, modify_ldt: Make support for 16-bit segments a runtime option

... reverted, to avoid a conflict. This commit is no longer necessary
with the proper fix in place.

Signed-off-by: H. Peter Anvin

H. Peter Anvin
2014-05-22 06:23:19 +0800

06 May, 2014

1 commit

722a9f929 asmlinkage: Add explicit __visible to drivers/*, lib/*, kernel/* ... Browse Code »

As requested by Linus add explicit __visible to the asmlinkage users.
This marks functions visible to assembler.

Tree sweep for rest of tree.

Signed-off-by: Andi Kleen
Link: http://lkml.kernel.org/r/1398984278-29319-4-git-send-email-andi@firstfloor.org
Signed-off-by: H. Peter Anvin

Andi Kleen
2014-05-06 07:07:46 +0800

05 May, 2014

1 commit

197725de6 x86, espfix: Make espfix64 a Kconfig option, fix UML ... Browse Code »

Make espfix64 a hidden Kconfig option. This fixes the x86-64 UML
build which had broken due to the non-existence of init_espfix_bsp()
in UML: since UML uses its own Kconfig, this option does not appear in
the UML build.

This also makes it possible to make support for 16-bit segments a
configuration option, for the people who want to minimize the size of
the kernel.

Reported-by: Ingo Molnar
Signed-off-by: H. Peter Anvin
Cc: Richard Weinberger
Link: http://lkml.kernel.org/r/1398816946-3351-1-git-send-email-hpa@linux.intel.com

H. Peter Anvin
2014-05-05 01:00:49 +0800

01 May, 2014

1 commit

3891a04aa x86-64, espfix: Don't leak bits 31:16 of %esp returning to 16-bit stack ... Browse Code »

The IRET instruction, when returning to a 16-bit segment, only
restores the bottom 16 bits of the user space stack pointer. This
causes some 16-bit software to break, but it also leaks kernel state
to user space. We have a software workaround for that ("espfix") for
the 32-bit kernel, but it relies on a nonzero stack segment base which
is not available in 64-bit mode.

In checkin:

b3b42ac2cbae x86-64, modify_ldt: Ban 16-bit segments on 64-bit kernels

we "solved" this by forbidding 16-bit segments on 64-bit kernels, with
the logic that 16-bit support is crippled on 64-bit kernels anyway (no
V86 support), but it turns out that people are doing stuff like
running old Win16 binaries under Wine and expect it to work.

This works around this by creating percpu "ministacks", each of which
is mapped 2^16 times 64K apart. When we detect that the return SS is
on the LDT, we copy the IRET frame to the ministack and use the
relevant alias to return to userspace. The ministacks are mapped
readonly, so if IRET faults we promote #GP to #DF which is an IST
vector and thus has its own stack; we then do the fixup in the #DF
handler.

(Making #GP an IST exception would make the msr_safe functions unsafe
in NMI/MC context, and quite possibly have other effects.)

Special thanks to:

- Andy Lutomirski, for the suggestion of using very small stack slots
and copy (as opposed to map) the IRET frame there, and for the
suggestion to mark them readonly and let the fault promote to #DF.
- Konrad Wilk for paravirt fixup and testing.
- Borislav Petkov for testing help and useful comments.

Reported-by: Brian Gerst
Signed-off-by: H. Peter Anvin
Link: http://lkml.kernel.org/r/1398816946-3351-1-git-send-email-hpa@linux.intel.com
Cc: Konrad Rzeszutek Wilk
Cc: Borislav Petkov
Cc: Andrew Lutomriski
Cc: Linus Torvalds
Cc: Dirk Hohndel
Cc: Arjan van de Ven
Cc: comex
Cc: Alexander van Heukelum
Cc: Boris Ostrovsky
Cc: # consider after upstream merge

H. Peter Anvin
2014-05-01 05:14:28 +0800

28 Apr, 2014

1 commit

51e158c12 param: hand arguments after -- straight to init ... Browse Code »

The kernel passes any args it doesn't need through to init, except it
assumes anything containing '.' belongs to the kernel (for a module).
This change means all users can clearly distinguish which arguments
are for init.

For example, the kernel uses debug ("dee-bug") to mean log everything to
the console, where systemd uses the debug from the Scandinavian "day-boog"
meaning "fail to boot". If a future versions uses argv[] instead of
reading /proc/cmdline, this confusion will be avoided.

eg: test 'FOO="this is --foo"' -- 'systemd.debug="true true true"'

Gives:
argv[0] = '/debug-init'
argv[1] = 'test'
argv[2] = 'systemd.debug=true true true'
envp[0] = 'HOME=/'
envp[1] = 'TERM=linux'
envp[2] = 'FOO=this is --foo'

Signed-off-by: Rusty Russell

Rusty Russell
2014-04-28 10:18:34 +0800

19 Apr, 2014

1 commit

82c04ff89 init/Kconfig: move the trusted keyring config option to general setup ... Browse Code »

The SYSTEM_TRUSTED_KEYRING config option is not in any menu, causing it
to show up in the toplevel of the kernel configuration. Fix this by
moving it under the General Setup menu.

Signed-off-by: Peter Foley
Cc: David Howells
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Peter Foley
2014-04-19 07:40:07 +0800

13 Apr, 2014

1 commit

0b747172d Merge git://git.infradead.org/users/eparis/audit ... Browse Code »

Pull audit updates from Eric Paris.

* git://git.infradead.org/users/eparis/audit: (28 commits)
AUDIT: make audit_is_compat depend on CONFIG_AUDIT_COMPAT_GENERIC
audit: renumber AUDIT_FEATURE_CHANGE into the 1300 range
audit: do not cast audit_rule_data pointers pointlesly
AUDIT: Allow login in non-init namespaces
audit: define audit_is_compat in kernel internal header
kernel: Use RCU_INIT_POINTER(x, NULL) in audit.c
sched: declare pid_alive as inline
audit: use uapi/linux/audit.h for AUDIT_ARCH declarations
syscall_get_arch: remove useless function arguments
audit: remove stray newline from audit_log_execve_info() audit_panic() call
audit: remove stray newlines from audit_log_lost messages
audit: include subject in login records
audit: remove superfluous new- prefix in AUDIT_LOGIN messages
audit: allow user processes to log from another PID namespace
audit: anchor all pid references in the initial pid namespace
audit: convert PPIDs to the inital PID namespace.
pid: get pid_t ppid of task in init_pid_ns
audit: rename the misleading audit_get_context() to audit_take_context()
audit: Add generic compat syscall support
audit: Add CONFIG_HAVE_ARCH_AUDITSYSCALL
...

Linus Torvalds
2014-04-13 03:38:53 +0800

08 Apr, 2014

2 commits

6aa7a29aa initramfs: debug detected compression method ... Browse Code »

This can greatly aid in narrowing down the real source of initramfs
problems such as failures related to the compression of the in-kernel
initramfs when an external initramfs is in use as well. Existing errors
are ambiguous as to which initramfs is a problem and why.

[akpm@linux-foundation.org: use pr_debug()]
Signed-off-by: Daniel M. Weeks
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Daniel M. Weeks
2014-04-08 07:36:11 +0800
5d2acfc7b kconfig: make allnoconfig disable options behind EMBEDDED and EXPERT ... Browse Code »

"make allnoconfig" exists to ease testing of minimal configurations.
Documentation/SubmitChecklist includes a note to test with allnoconfig.
This helps catch missing dependencies on common-but-not-required
functionality, which might otherwise go unnoticed.

However, allnoconfig still leaves many symbols enabled, because they're
hidden behind CONFIG_EMBEDDED or CONFIG_EXPERT. For instance, allnoconfig
still has CONFIG_PRINTK and CONFIG_BLOCK enabled, so drivers don't
typically get build-tested with those disabled.

To address this, introduce a new Kconfig option "allnoconfig_y", used on
symbols which only exist to hide other symbols. Set it on CONFIG_EMBEDDED
(which then selects CONFIG_EXPERT). allnoconfig will then disable all the
symbols hidden behind those.

Signed-off-by: Josh Triplett
Tested-by: Paul E. McKenney
Cc: Michal Marek
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Josh Triplett
2014-04-08 07:36:09 +0800

04 Apr, 2014

5 commits

76ca7d1cc Merge branch 'akpm' (incoming from Andrew) ... Browse Code »

Merge first patch-bomb from Andrew Morton:
- Various misc bits
- kmemleak fixes
- small befs, codafs, cifs, efs, freexxfs, hfsplus, minixfs, reiserfs things
- fanotify
- I appear to have become SuperH maintainer
- ocfs2 updates
- direct-io tweaks
- a bit of the MM queue
- printk updates
- MAINTAINERS maintenance
- some backlight things
- lib/ updates
- checkpatch updates
- the rtc queue
- nilfs2 updates
- Small Documentation/ updates

* emailed patches from Andrew Morton : (237 commits)
Documentation/SubmittingPatches: remove references to patch-scripts
Documentation/SubmittingPatches: update some dead URLs
Documentation/filesystems/ntfs.txt: remove changelog reference
Documentation/kmemleak.txt: updates
fs/reiserfs/super.c: add __init to init_inodecache
fs/reiserfs: move prototype declaration to header file
fs/hfsplus/attributes.c: add __init to hfsplus_create_attr_tree_cache()
fs/hfsplus/extents.c: fix concurrent acess of alloc_blocks
fs/hfsplus/extents.c: remove unused variable in hfsplus_get_block
nilfs2: update project's web site in nilfs2.txt
nilfs2: update MAINTAINERS file entries fix
nilfs2: verify metadata sizes read from disk
nilfs2: add FITRIM ioctl support for nilfs2
nilfs2: add nilfs_sufile_trim_fs to trim clean segs
nilfs2: implementation of NILFS_IOCTL_SET_SUINFO ioctl
nilfs2: add nilfs_sufile_set_suinfo to update segment usage
nilfs2: add struct nilfs_suinfo_update and flags
nilfs2: update MAINTAINERS file entries
fs/coda/inode.c: add __init to init_inodecache()
BEFS: logging cleanup
...

Linus Torvalds
2014-04-04 07:22:16 +0800
a68b31080 init/do_mounts.c: fix comment error ... Browse Code »

Signed-off-by: chishanmingshen
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

chishanmingshen
2014-04-04 07:21:16 +0800
69369a700 fs, kernel: permit disabling the uselib syscall ... Browse Code »

uselib hasn't been used since libc5; glibc does not use it. Support
turning it off.

When disabled, also omit the load_elf_library implementation from
binfmt_elf.c, which only uselib invokes.

bloat-o-meter:
add/remove: 0/4 grow/shrink: 0/1 up/down: 0/-785 (-785)
function old new delta
padzero 39 36 -3
uselib_flags 20 - -20
sys_uselib 168 - -168
SyS_uselib 168 - -168
load_elf_library 426 - -426

The new CONFIG_USELIB defaults to `y'.

Signed-off-by: Josh Triplett
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Josh Triplett
2014-04-04 07:21:05 +0800
6af9f7bf3 sys_sysfs: Add CONFIG_SYSFS_SYSCALL ... Browse Code »

sys_sysfs is an obsolete system call no longer supported by libc.

- This patch adds a default CONFIG_SYSFS_SYSCALL=y

- Option can be turned off in expert mode.

- cond_syscall added to kernel/sys_ni.c

[akpm@linux-foundation.org: tweak Kconfig help text]
Signed-off-by: Fabian Frederick
Cc: Randy Dunlap
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Fabian Frederick
2014-04-04 07:21:05 +0800
32d01dc7b Merge branch 'for-3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup ... Browse Code »

Pull cgroup updates from Tejun Heo:
"A lot updates for cgroup:

- The biggest one is cgroup's conversion to kernfs. cgroup took
after the long abandoned vfs-entangled sysfs implementation and
made it even more convoluted over time. cgroup's internal objects
were fused with vfs objects which also brought in vfs locking and
object lifetime rules. Naturally, there are places where vfs rules
don't fit and nasty hacks, such as credential switching or lock
dance interleaving inode mutex and cgroup_mutex with object serial
number comparison thrown in to decide whether the operation is
actually necessary, needed to be employed.

After conversion to kernfs, internal object lifetime and locking
rules are mostly isolated from vfs interactions allowing shedding
of several nasty hacks and overall simplification. This will also
allow implmentation of operations which may affect multiple cgroups
which weren't possible before as it would have required nesting
i_mutexes.

- Various simplifications including dropping of module support,
easier cgroup name/path handling, simplified cgroup file type
handling and task_cg_lists optimization.

- Prepatory changes for the planned unified hierarchy, which is still
a patchset away from being actually operational. The dummy
hierarchy is updated to serve as the default unified hierarchy.
Controllers which aren't claimed by other hierarchies are
associated with it, which BTW was what the dummy hierarchy was for
anyway.

- Various fixes from Li and others. This pull request includes some
patches to add missing slab.h to various subsystems. This was
triggered xattr.h include removal from cgroup.h. cgroup.h
indirectly got included a lot of files which brought in xattr.h
which brought in slab.h.

There are several merge commits - one to pull in kernfs updates
necessary for converting cgroup (already in upstream through
driver-core), others for interfering changes in the fixes branch"

* 'for-3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (74 commits)
cgroup: remove useless argument from cgroup_exit()
cgroup: fix spurious lockdep warning in cgroup_exit()
cgroup: Use RCU_INIT_POINTER(x, NULL) in cgroup.c
cgroup: break kernfs active_ref protection in cgroup directory operations
cgroup: fix cgroup_taskset walking order
cgroup: implement CFTYPE_ONLY_ON_DFL
cgroup: make cgrp_dfl_root mountable
cgroup: drop const from @buffer of cftype->write_string()
cgroup: rename cgroup_dummy_root and related names
cgroup: move ->subsys_mask from cgroupfs_root to cgroup
cgroup: treat cgroup_dummy_root as an equivalent hierarchy during rebinding
cgroup: remove NULL checks from [pr_cont_]cgroup_{name|path}()
cgroup: use cgroup_setup_root() to initialize cgroup_dummy_root
cgroup: reorganize cgroup bootstrapping
cgroup: relocate setting of CGRP_DEAD
cpuset: use rcu_read_lock() to protect task_cs()
cgroup_freezer: document freezer_fork() subtleties
cgroup: update cgroup_transfer_tasks() to either succeed or fail
cgroup: drop task_lock() protection around task->cgroups
cgroup: update how a newly forked task gets associated with css_set
...

Linus Torvalds
2014-04-04 04:05:42 +0800