Eric Lee / smarc-ti-linux-kernel | Embedian Git Server

20 Feb, 2015

2 commits

b11a27839 Merge branch 'kconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild ... Browse Code »

Pull kconfig updates from Michal Marek:
"Yann E Morin was supposed to take over kconfig maintainership, but
this hasn't happened. So I'm sending a few kconfig patches that I
collected:

- Fix for missing va_end in kconfig
- merge_config.sh displays used if given too few arguments
- s/boolean/bool/ in Kconfig files for consistency, with the plan to
only support bool in the future"

* 'kconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
kconfig: use va_end to match corresponding va_start
merge_config.sh: Display usage if given too few arguments
kconfig: use bool instead of boolean for type definition attributes

Linus Torvalds
2015-02-20 02:36:45 +0800
773433433 Merge branch 'misc' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild ... Browse Code »

Pull misc kbuild changes from Michal Marek:
"Just a few non-critical kbuild changes:

- builddeb adds the actual distribution name in the changelog
- documentation fixes"

* 'misc' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
kbuild: trivial - fix the help doc of CONFIG_CC_OPTIMIZE_FOR_SIZE
kbuild: Update documentation of clean-files and clean-dirs
builddeb: Try to determine distribution
builddeb: Update year and git repository URL in debian/copyright

Linus Torvalds
2015-02-20 02:31:37 +0800

14 Feb, 2015

1 commit

5125991c9 init: remove CONFIG_INIT_FALLBACK ... Browse Code »

CONFIG_INIT_FALLBACK adds config bloat without an obvious use case that
makes it worth keeping around. Delete it.

Signed-off-by: Andy Lutomirski
Cc: Rusty Russell
Cc: Chuck Ebbert
Cc: Frank Rowand
Reviewed-by: Josh Triplett
Cc: Randy Dunlap
Cc: Rob Landley
Cc: Shuah Khan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Andy Lutomirski
2015-02-14 13:21:42 +0800

10 Feb, 2015

1 commit

9d43bade3 Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip ... Browse Code »

Pull x86 APIC updates from Ingo Molnar:
"Continued fallout of the conversion of the x86 IRQ code to the
hierarchical irqdomain framework: more cleanups, simplifications,
memory allocation behavior enhancements, mainly in the interrupt
remapping and APIC code"

* 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (44 commits)
x86, init: Fix UP boot regression on x86_64
iommu/amd: Fix irq remapping detection logic
x86/acpi: Make acpi_[un]register_gsi_ioapic() depend on CONFIG_X86_LOCAL_APIC
x86: Consolidate boot cpu timer setup
x86/apic: Reuse apic_bsp_setup() for UP APIC setup
x86/smpboot: Sanitize uniprocessor init
x86/smpboot: Move apic init code to apic.c
init: Get rid of x86isms
x86/apic: Move apic_init_uniprocessor code
x86/smpboot: Cleanup ioapic handling
x86/apic: Sanitize ioapic handling
x86/ioapic: Add proper checks to setp/enable_IO_APIC()
x86/ioapic: Provide stub functions for IOAPIC%3Dn
x86/smpboot: Move smpboot inlines to code
x86/x2apic: Use state information for disable
x86/x2apic: Split enable and setup function
x86/x2apic: Disable x2apic from nox2apic setup
x86/x2apic: Add proper state tracking
x86/x2apic: Clarify remapping mode for x2apic enablement
x86/x2apic: Move code in conditional region
...

Linus Torvalds
2015-02-10 08:57:56 +0800

22 Jan, 2015

1 commit

30b8b0066 init: Get rid of x86isms ... Browse Code »
26

The UP local API support can be set up from an early initcall. No need
for horrible hackery in the init code.

Signed-off-by: Thomas Gleixner
Cc: Jiang Liu
Cc: Joerg Roedel
Cc: Tony Luck
Cc: Borislav Petkov
Link: http://lkml.kernel.org/r/20150115211703.827943883@linutronix.de
Signed-off-by: Thomas Gleixner

Thomas Gleixner
2015-01-22 22:10:56 +0800

16 Jan, 2015

2 commits

78e691f4a Merge branches 'doc.2015.01.07a', 'fixes.2015.01.15a', 'preempt.2015.01.06a', 's… ... Browse Code »

…rcu.2015.01.06a', 'stall.2015.01.16a' and 'torture.2015.01.11a' into HEAD

doc.2015.01.07a: Documentation updates.
fixes.2015.01.15a: Miscellaneous fixes.
preempt.2015.01.06a: Changes to handling of lists of preempted tasks.
srcu.2015.01.06a: SRCU updates.
stall.2015.01.16a: RCU CPU stall-warning updates and fixes.
torture.2015.01.11a: RCU torture-test updates and fixes.

Paul E. McKenney
2015-01-16 15:34:34 +0800
a94844b22 rcu: Optionally run grace-period kthreads at real-time priority ... Browse Code »

Recent testing has shown that under heavy load, running RCU's grace-period
kthreads at real-time priority can improve performance (according to 0day
test robot) and reduce the incidence of RCU CPU stall warnings. However,
most systems do just fine with the default non-realtime priorities for
these kthreads, and it does not make sense to expose the entire user
base to any risk stemming from this change, given that this change is
of use only to a few users running extremely heavy workloads.

Therefore, this commit allows users to specify realtime priorities
for the grace-period kthreads, but leaves them running SCHED_OTHER
by default. The realtime priority may be specified at build time
via the RCU_KTHREAD_PRIO Kconfig parameter, or at boot time via the
rcutree.kthread_prio parameter. Either way, 0 says to continue the
default SCHED_OTHER behavior and values from 1-99 specify that priority
of SCHED_FIFO behavior. Note that a value of 0 is not permitted when
the RCU_BOOST Kconfig parameter is specified.

Signed-off-by: Paul E. McKenney

Paul E. McKenney
2015-01-16 15:25:04 +0800

08 Jan, 2015

1 commit

31a4af7f7 kbuild: trivial - fix the help doc of CONFIG_CC_OPTIMIZE_FOR_SIZE ... Browse Code »

Other than GCC, we have another choice, Clang for building the kernel
these days. It seems better to say "compiler" rather than "gcc".

Signed-off-by: Masahiro Yamada
Signed-off-by: Michal Marek

Masahiro Yamada
2015-01-08 21:53:50 +0800

07 Jan, 2015

3 commits

6341e62b2 kconfig: use bool instead of boolean for type definition attributes ... Browse Code »

Support for keyword 'boolean' will be dropped later on.

No functional change.

Reference: http://lkml.kernel.org/r/cover.1418003065.git.cj@linux.com
Signed-off-by: Christoph Jaeger
Signed-off-by: Michal Marek

Christoph Jaeger
2015-01-07 20:08:04 +0800
83fe27ea5 rcu: Make SRCU optional by using CONFIG_SRCU ... Browse Code »

SRCU is not necessary to be compiled by default in all cases. For tinification
efforts not compiling SRCU unless necessary is desirable.

The current patch tries to make compiling SRCU optional by introducing a new
Kconfig option CONFIG_SRCU which is selected when any of the components making
use of SRCU are selected.

If we do not select CONFIG_SRCU, srcu.o will not be compiled at all.

text data bss dec hex filename
2007 0 0 2007 7d7 kernel/rcu/srcu.o

Size of arch/powerpc/boot/zImage changes from

text data bss dec hex filename
831552 64180 23944 919676 e087c arch/powerpc/boot/zImage : before
829504 64180 23952 917636 e0084 arch/powerpc/boot/zImage : after

so the savings are about ~2000 bytes.

Signed-off-by: Pranith Kumar
CC: Paul E. McKenney
CC: Josh Triplett
CC: Lai Jiangshan
Signed-off-by: Paul E. McKenney
[ paulmck: resolve conflict due to removal of arch/ia64/kvm/Kconfig. ]

Pranith Kumar
2015-01-07 03:04:29 +0800
5a43b88e9 rcu: Remove "select IRQ_WORK" from config TREE_RCU ... Browse Code »

The 48a7639ce80c ("rcu: Make callers awaken grace-period kthread")
removed the irq_work_queue(), so the TREE_RCU doesn't need
irq work any more. This commit therefore updates RCU's Kconfig and

Signed-off-by: Lai Jiangshan
Signed-off-by: Paul E. McKenney

Lai Jiangshan
2015-01-07 03:01:17 +0800

17 Dec, 2014

3 commits

10975933d init: fix read-write root mount ... Browse Code »

If mount flags don't have MS_RDONLY, iso9660 returns EACCES without actually
checking if it's an iso image.

This tricks mount_block_root() into retrying with MS_RDONLY. This results
in a read-only root despite the "rw" boot parameter if the actual
filesystem was checked after iso9660.

I believe the behavior of iso9660 is okay, while that of mount_block_root()
is not. It should rather try all types without MS_RDONLY and only then
retry with MS_RDONLY.

This change also makes the code more robust against the case when EACCES is
returned despite MS_RDONLY, which would've resulted in a lockup.

Signed-off-by: Miklos Szeredi
Signed-off-by: Al Viro

Miklos Szeredi
2014-12-17 21:27:14 +0800
603ba7e41 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs ... Browse Code »

Pull vfs pile #2 from Al Viro:
"Next pile (and there'll be one or two more).

The large piece in this one is getting rid of /proc/*/ns/* weirdness;
among other things, it allows to (finally) make nameidata completely
opaque outside of fs/namei.c, making for easier further cleanups in
there"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
coda_venus_readdir(): use file_inode()
fs/namei.c: fold link_path_walk() call into path_init()
path_init(): don't bother with LOOKUP_PARENT in argument
fs/namei.c: new helper (path_cleanup())
path_init(): store the "base" pointer to file in nameidata itself
make default ->i_fop have ->open() fail with ENXIO
make nameidata completely opaque outside of fs/namei.c
kill proc_ns completely
take the targets of /proc/*/ns/* symlinks to separate fs
bury struct proc_ns in fs/proc
copy address of proc_ns_ops into ns_common
new helpers: ns_alloc_inum/ns_free_inum
make proc_ns_operations work with struct ns_common * instead of void *
switch the rest of proc_ns_operations to working with &...->ns
netns: switch ->get()/->put()/->install()/->inum() to working with &net->ns
make mntns ->get()/->put()/->install()/->inum() work with &mnt_ns->ns
common object embedded into various struct ....ns

Linus Torvalds
2014-12-17 07:53:03 +0800
a7c180aa7 Merge tag 'trace-3.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace ... Browse Code »

Pull tracing updates from Steven Rostedt:
"As the merge window is still open, and this code was not as complex as
I thought it might be. I'm pushing this in now.

This will allow Thomas to debug his irq work for 3.20.

This adds two new features:

1) Allow traceopoints to be enabled right after mm_init().

By passing in the trace_event= kernel command line parameter,
tracepoints can be enabled at boot up. For debugging things like
the initialization of interrupts, it is needed to have tracepoints
enabled very early. People have asked about this before and this
has been on my todo list. As it can be helpful for Thomas to debug
his upcoming 3.20 IRQ work, I'm pushing this now. This way he can
add tracepoints into the IRQ set up and have users enable them when
things go wrong.

2) Have the tracepoints printed via printk() (the console) when they
are triggered.

If the irq code locks up or reboots the box, having the tracepoint
output go into the kernel ring buffer is useless for debugging.
But being able to add the tp_printk kernel command line option
along with the trace_event= option will have these tracepoints
printed as they occur, and that can be really useful for debugging
early lock up or reboot problems.

This code is not that intrusive and it passed all my tests. Thomas
tried them out too and it works for his needs.

Link: http://lkml.kernel.org/r/20141214201609.126831471@goodmis.org"

* tag 'trace-3.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing: Add tp_printk cmdline to have tracepoints go to printk()
tracing: Move enabling tracepoints to just after rcu_init()

Linus Torvalds
2014-12-17 04:53:59 +0800

15 Dec, 2014

2 commits

5f893b263 tracing: Move enabling tracepoints to just after rcu_init() ... Browse Code »
13

Enabling tracepoints at boot up can be very useful. The tracepoint
can be initialized right after RCU has been. There's no need to
wait for the early_initcall() to be called. That's too late for some
things that can use tracepoints for debugging. Move the logic to
enable tracepoints out of the initcalls and into init/main.c to
right after rcu_init().

This also allows trace_printk() to be used early too.

Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1412121539300.16494@nanos
Link: http://lkml.kernel.org/r/20141214164104.307127356@goodmis.org

Reviewed-by: Paul E. McKenney
Suggested-by: Thomas Gleixner
Tested-by: Thomas Gleixner
Acked-by: Thomas Gleixner
Signed-off-by: Steven Rostedt

Steven Rostedt (Red Hat)
2014-12-15 23:16:50 +0800
67e2c3883 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security ... Browse Code »

Pull security layer updates from James Morris:
"In terms of changes, there's general maintenance to the Smack,
SELinux, and integrity code.

The IMA code adds a new kconfig option, IMA_APPRAISE_SIGNED_INIT,
which allows IMA appraisal to require signatures. Support for reading
keys from rootfs before init is call is also added"

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (23 commits)
selinux: Remove security_ops extern
security: smack: fix out-of-bounds access in smk_parse_smack()
VFS: refactor vfs_read()
ima: require signature based appraisal
integrity: provide a hook to load keys when rootfs is ready
ima: load x509 certificate from the kernel
integrity: provide a function to load x509 certificate from the kernel
integrity: define a new function integrity_read_file()
Security: smack: replace kzalloc with kmem_cache for inode_smack
Smack: Lock mode for the floor and hat labels
ima: added support for new kernel cmdline parameter ima_template_fmt
ima: allocate field pointers array on demand in template_desc_init_fields()
ima: don't allocate a copy of template_fmt in template_desc_init_fields()
ima: display template format in meas. list if template name length is zero
ima: added error messages to template-related functions
ima: use atomic bit operations to protect policy update interface
ima: ignore empty and with whitespaces policy lines
ima: no need to allocate entry for comment
ima: report policy load status
ima: use path names cache
...

Linus Torvalds
2014-12-15 12:36:37 +0800

14 Dec, 2014

1 commit

eefa864b7 mm/page_ext: resurrect struct page extending code for debugging ... Browse Code »

When we debug something, we'd like to insert some information to every
page. For this purpose, we sometimes modify struct page itself. But,
this has drawbacks. First, it requires re-compile. This makes us
hesitate to use the powerful debug feature so development process is
slowed down. And, second, sometimes it is impossible to rebuild the
kernel due to third party module dependency. At third, system behaviour
would be largely different after re-compile, because it changes size of
struct page greatly and this structure is accessed by every part of
kernel. Keeping this as it is would be better to reproduce errornous
situation.

This feature is intended to overcome above mentioned problems. This
feature allocates memory for extended data per page in certain place
rather than the struct page itself. This memory can be accessed by the
accessor functions provided by this code. During the boot process, it
checks whether allocation of huge chunk of memory is needed or not. If
not, it avoids allocating memory at all. With this advantage, we can
include this feature into the kernel in default and can avoid rebuild and
solve related problems.

Until now, memcg uses this technique. But, now, memcg decides to embed
their variable to struct page itself and it's code to extend struct page
has been removed. I'd like to use this code to develop debug feature, so
this patch resurrect it.

To help these things to work well, this patch introduces two callbacks for
clients. One is the need callback which is mandatory if user wants to
avoid useless memory allocation at boot-time. The other is optional, init
callback, which is used to do proper initialization after memory is
allocated. Detailed explanation about purpose of these functions is in
code comment. Please refer it.

Others are completely same with previous extension code in memcg.

Signed-off-by: Joonsoo Kim
Cc: Mel Gorman
Cc: Johannes Weiner
Cc: Minchan Kim
Cc: Dave Hansen
Cc: Michal Nazarewicz
Cc: Jungsoo Son
Cc: Ingo Molnar
Cc: Joonsoo Kim
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Joonsoo Kim
2014-12-14 04:42:48 +0800

11 Dec, 2014

9 commits

707c5960f Merge branch 'nsfs' into for-next Browse Code »

Al Viro
2014-12-11 10:31:59 +0800
e149ed2b8 take the targets of /proc/*/ns/* symlinks to separate fs ... Browse Code »

New pseudo-filesystem: nsfs. Targets of /proc/*/ns/* live there now.
It's not mountable (not even registered, so it's not in /proc/filesystems,
etc.). Files on it *are* bindable - we explicitly permit that in do_loopback().

This stuff lives in fs/nsfs.c now; proc_ns_fget() moved there as well.
get_proc_ns() is a macro now (it's simply returning ->i_private; would
have been an inline, if not for header ordering headache).
proc_ns_inode() is an ex-parrot. The interface used in procfs is
ns_get_path(path, task, ops) and ns_get_name(buf, size, task, ops).

Dentries and inodes are never hashed; a non-counting reference to dentry
is stashed in ns_common (removed by ->d_prune()) and reused by ns_get_path()
if present. See ns_get_path()/ns_prune_dentry/nsfs_evict() for details
of that mechanism.

As the result, proc_ns_follow_link() has stopped poking in nd->path.mnt;
it does nd_jump_link() on a consistent pair it gets
from ns_get_path().

Signed-off-by: Al Viro

Al Viro
2014-12-11 10:30:20 +0800
6ef4536e2 init: allow CONFIG_INIT_FALLBACK=n to disable defaults if init= fails ... Browse Code »

If a user puts init=/whatever on the command line and /whatever can't be
run, then the kernel will try a few default options before giving up. If
init=/whatever came from a bootloader prompt, then this is unexpected but
probably harmless. On the other hand, if it comes from a script (e.g. a
tool like virtme or perhaps a future kselftest script), then the fallbacks
are likely to exist, but they'll do the wrong thing. For example, they
might unexpectedly invoke systemd.

This adds a config option CONFIG_INIT_FALLBACK. If unset, then a failure
to run the specified init= process be fatal.

The tentative plan is to remove CONFIG_INIT_FALLBACK for 3.20.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Andy Lutomirski
Cc: Rob Landley
Cc: Chuck Ebbert
Cc: Randy Dunlap
Cc: Shuah Khan
Cc: Frank Rowand
Cc: Josh Triplett
Acked-by: Rusty Russell
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Andy Lutomirski
2014-12-11 09:41:12 +0800
9edad6ea0 mm: move page->mem_cgroup bad page handling into generic code ... Browse Code »

Now that the external page_cgroup data structure and its lookup is
gone, let the generic bad_page() check for page->mem_cgroup sanity.

Signed-off-by: Johannes Weiner
Acked-by: Michal Hocko
Acked-by: Vladimir Davydov
Acked-by: David S. Miller
Cc: KAMEZAWA Hiroyuki
Cc: "Kirill A. Shutemov"
Cc: Tejun Heo
Cc: Joonsoo Kim
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Johannes Weiner
2014-12-11 09:41:09 +0800
1306a85ae mm: embed the memcg pointer directly into struct page ... Browse Code »

Memory cgroups used to have 5 per-page pointers. To allow users to
disable that amount of overhead during runtime, those pointers were
allocated in a separate array, with a translation layer between them and
struct page.

There is now only one page pointer remaining: the memcg pointer, that
indicates which cgroup the page is associated with when charged. The
complexity of runtime allocation and the runtime translation overhead is
no longer justified to save that *potential* 0.19% of memory. With
CONFIG_SLUB, page->mem_cgroup actually sits in the doubleword padding
after the page->private member and doesn't even increase struct page,
and then this patch actually saves space. Remaining users that care can
still compile their kernels without CONFIG_MEMCG.

text data bss dec hex filename
8828345 1725264 983040 11536649 b00909 vmlinux.old
8827425 1725264 966656 11519345 afc571 vmlinux.new

[mhocko@suse.cz: update Documentation/cgroups/memory.txt]
Signed-off-by: Johannes Weiner
Acked-by: Michal Hocko
Acked-by: Vladimir Davydov
Acked-by: David S. Miller
Acked-by: KAMEZAWA Hiroyuki
Cc: "Kirill A. Shutemov"
Cc: Michal Hocko
Cc: Vladimir Davydov
Cc: Tejun Heo
Cc: Joonsoo Kim
Acked-by: Konstantin Khlebnikov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Johannes Weiner
2014-12-11 09:41:09 +0800
6f7c97e80 mm/numa balancing: rearrange Kconfig entry ... Browse Code »

Add the default enable config option after the NUMA_BALANCING option so
that it appears related in the nconfig interface.

Signed-off-by: Aneesh Kumar K.V
Acked-by: David Rientjes
Cc: Mel Gorman
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Aneesh Kumar K.V
2014-12-11 09:41:06 +0800
5b1efc027 kernel: res_counter: remove the unused API ... Browse Code »

All memory accounting and limiting has been switched over to the
lockless page counters. Bye, res_counter!

[akpm@linux-foundation.org: update Documentation/cgroups/memory.txt]
[mhocko@suse.cz: ditch the last remainings of res_counter]
Signed-off-by: Johannes Weiner
Acked-by: Vladimir Davydov
Acked-by: Michal Hocko
Cc: Tejun Heo
Cc: David Rientjes
Cc: Paul Bolle
Signed-off-by: Michal Hocko
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Johannes Weiner
2014-12-11 09:41:04 +0800
71f87bee3 mm: hugetlb_cgroup: convert to lockless page counters ... Browse Code »

Abandon the spinlock-protected byte counters in favor of the unlocked
page counters in the hugetlb controller as well.

Signed-off-by: Johannes Weiner
Reviewed-by: Vladimir Davydov
Acked-by: Michal Hocko
Cc: Tejun Heo
Cc: David Rientjes
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Johannes Weiner
2014-12-11 09:41:04 +0800
3e32cb2e0 mm: memcontrol: lockless page counters ... Browse Code »
34

Memory is internally accounted in bytes, using spinlock-protected 64-bit
counters, even though the smallest accounting delta is a page. The
counter interface is also convoluted and does too many things.

Introduce a new lockless word-sized page counter API, then change all
memory accounting over to it. The translation from and to bytes then only
happens when interfacing with userspace.

The removed locking overhead is noticable when scaling beyond the per-cpu
charge caches - on a 4-socket machine with 144-threads, the following test
shows the performance differences of 288 memcgs concurrently running a
page fault benchmark:

vanilla:

18631648.500498 task-clock (msec) # 140.643 CPUs utilized ( +- 0.33% )
1,380,638 context-switches # 0.074 K/sec ( +- 0.75% )
24,390 cpu-migrations # 0.001 K/sec ( +- 8.44% )
1,843,305,768 page-faults # 0.099 M/sec ( +- 0.00% )
50,134,994,088,218 cycles # 2.691 GHz ( +- 0.33% )
stalled-cycles-frontend
stalled-cycles-backend
8,049,712,224,651 instructions # 0.16 insns per cycle ( +- 0.04% )
1,586,970,584,979 branches # 85.176 M/sec ( +- 0.05% )
1,724,989,949 branch-misses # 0.11% of all branches ( +- 0.48% )

132.474343877 seconds time elapsed ( +- 0.21% )

lockless:

12195979.037525 task-clock (msec) # 133.480 CPUs utilized ( +- 0.18% )
832,850 context-switches # 0.068 K/sec ( +- 0.54% )
15,624 cpu-migrations # 0.001 K/sec ( +- 10.17% )
1,843,304,774 page-faults # 0.151 M/sec ( +- 0.00% )
32,811,216,801,141 cycles # 2.690 GHz ( +- 0.18% )
stalled-cycles-frontend
stalled-cycles-backend
9,999,265,091,727 instructions # 0.30 insns per cycle ( +- 0.10% )
2,076,759,325,203 branches # 170.282 M/sec ( +- 0.12% )
1,656,917,214 branch-misses # 0.08% of all branches ( +- 0.55% )

91.369330729 seconds time elapsed ( +- 0.45% )

On top of improved scalability, this also gets rid of the icky long long
types in the very heart of memcg, which is great for 32 bit and also makes
the code a lot more readable.

Notable differences between the old and new API:

- res_counter_charge() and res_counter_charge_nofail() become
page_counter_try_charge() and page_counter_charge() resp. to match
the more common kernel naming scheme of try_do()/do()

- res_counter_uncharge_until() is only ever used to cancel a local
counter and never to uncharge bigger segments of a hierarchy, so
it's replaced by the simpler page_counter_cancel()

- res_counter_set_limit() is replaced by page_counter_limit(), which
expects its callers to serialize against themselves

- res_counter_memparse_write_strategy() is replaced by
page_counter_limit(), which rounds down to the nearest page size -
rather than up. This is more reasonable for explicitely requested
hard upper limits.

- to keep charging light-weight, page_counter_try_charge() charges
speculatively, only to roll back if the result exceeds the limit.
Because of this, a failing bigger charge can temporarily lock out
smaller charges that would otherwise succeed. The error is bounded
to the difference between the smallest and the biggest possible
charge size, so for memcg, this means that a failing THP charge can
send base page charges into reclaim upto 2MB (4MB) before the limit
would have been reached. This should be acceptable.

[akpm@linux-foundation.org: add includes for WARN_ON_ONCE and memparse]
[akpm@linux-foundation.org: add includes for WARN_ON_ONCE, memparse, strncmp, and PAGE_SIZE]
Signed-off-by: Johannes Weiner
Acked-by: Michal Hocko
Acked-by: Vladimir Davydov
Cc: Tejun Heo
Cc: David Rientjes
Cc: Stephen Rothwell
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Johannes Weiner
2014-12-11 09:41:04 +0800

05 Dec, 2014

2 commits

33c429405 copy address of proc_ns_ops into ns_common ... Browse Code »

Signed-off-by: Al Viro

Al Viro
2014-12-05 03:34:47 +0800
435d5f4bb common object embedded into various struct ....ns ... Browse Code »

for now - just move corresponding ->proc_inum instances over there

Acked-by: "Eric W. Biederman"
Signed-off-by: Al Viro

Al Viro
2014-12-05 03:31:00 +0800

20 Nov, 2014

1 commit

d360b78f9 Merge branch 'rcu/next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck… ... Browse Code »

…/linux-rcu into core/rcu

Pull RCU updates from Paul E. McKenney:

- Streamline RCU's use of per-CPU variables, shifting from "cpu"
arguments to functions to "this_"-style per-CPU variable accessors.

- Signal-handling RCU updates.

- Real-time updates.

- Torture-test updates.

- Miscellaneous fixes.

- Documentation updates.

Signed-off-by: Ingo Molnar <mingo@kernel.org>

Ingo Molnar
2014-11-20 15:57:58 +0800

19 Nov, 2014

1 commit

a6aacbde4 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity into next Browse Code »

James Morris
2014-11-19 18:36:07 +0800

18 Nov, 2014

1 commit

c9cd2ce2b integrity: provide a hook to load keys when rootfs is ready ... Browse Code »

Keys can only be loaded once the rootfs is mounted. Initcalls
are not suitable for that. This patch defines a special hook
to load the x509 public keys onto the IMA keyring, before
attempting to access any file. The keys are required for
verifying the file's signature. The hook is called after the
root filesystem is mounted and before the kernel calls 'init'.

Changes in v3:
* added more explanation to the patch description (Mimi)

Changes in v2:
* Hook renamed as 'integrity_load_keys()' to handle both IMA and EVM
keys by integrity subsystem.
* Hook patch moved after defining loading functions

Signed-off-by: Dmitry Kasatkin
Signed-off-by: Mimi Zohar

Dmitry Kasatkin
2014-11-18 12:12:01 +0800

11 Nov, 2014

1 commit

3438cf549 param: fix crash on bad kernel arguments ... Browse Code »

Currently if the user passes an invalid value on the kernel command line
then the kernel will crash during argument parsing. On most systems this
is very hard to debug because the console hasn't been initialized yet.

This is a regression due to commit 51e158c12aca ("param: hand arguments
after -- straight to init") which, in response to the systemd debug
controversy, made it possible to explicitly pass arguments to init. To
achieve this parse_args() was extended from simply returning an error
code to returning a pointer. Regretably the new init args logic does not
perform a proper validity check on the pointer resulting in a crash.

This patch fixes the validity check. Should the check fail then no arguments
will be passed to init. This is reasonable and matches how the kernel treats
its own arguments (i.e. no error recovery).

Signed-off-by: Daniel Thompson
Cc: stable@vger.kernel.org
Signed-off-by: Rusty Russell

Daniel Thompson
2014-11-11 14:33:19 +0800

30 Oct, 2014

2 commits

28f6569ab rcu: Remove redundant TREE_PREEMPT_RCU config option ... Browse Code »

PREEMPT_RCU and TREE_PREEMPT_RCU serve the same function after
TINY_PREEMPT_RCU has been removed. This patch removes TREE_PREEMPT_RCU
and uses PREEMPT_RCU config option in its place.

Signed-off-by: Pranith Kumar
Signed-off-by: Paul E. McKenney

Pranith Kumar
2014-10-30 01:20:05 +0800
21871d7ef rcu: Unify boost and kthread priorities ... Browse Code »

Rename CONFIG_RCU_BOOST_PRIO to CONFIG_RCU_KTHREAD_PRIO and use this
value for both the per-CPU kthreads (rcuc/N) and the rcu boosting
threads (rcub/n).

Also, create the module_parameter rcutree.kthread_prio to be used on
the kernel command line at boot to set a new value (rcutree.kthread_prio=N).

Signed-off-by: Clark Williams
[ paulmck: Ported to rcu/dev, applied Paul Bolle and Peter Zijlstra feedback. ]
Signed-off-by: Paul E. McKenney

Clark Williams
2014-10-30 01:19:41 +0800

29 Oct, 2014

1 commit

4568779f7 init/Kconfig: move RCU_NOCB_CPU dependencies to choice ... Browse Code »

Every choice item of the "Build-forced no-CBs CPUs" choice had a
dependency to RCU_NOCB_CPU. It's more comprehensible if the choice
itself has the dependency instead of every choice item. The choice
itself doesn't need to be visible if there are no items selectable
(i.e. on arch/frv) or RCU_NOCB_CPU is not defined.

Signed-off-by: Stefan Hengelein
Signed-off-by: Andreas Ruprecht
Reviewed-by: Josh Triplett
Signed-off-by: Andrew Morton
Signed-off-by: Paul E. McKenney

Stefan Hengelein
2014-10-29 04:49:27 +0800

28 Oct, 2014

1 commit

f89b7755f bpf: split eBPF out of NET ... Browse Code »

introduce two configs:
- hidden CONFIG_BPF to select eBPF interpreter that classic socket filters
depend on
- visible CONFIG_BPF_SYSCALL (default off) that tracing and sockets can use

that solves several problems:
- tracing and others that wish to use eBPF don't need to depend on NET.
They can use BPF_SYSCALL to allow loading from userspace or select BPF
to use it directly from kernel in NET-less configs.
- in 3.18 programs cannot be attached to events yet, so don't force it on
- when the rest of eBPF infra is there in 3.19+, it's still useful to
switch it off to minimize kernel size

bloat-o-meter on x64 shows:
add/remove: 0/60 grow/shrink: 0/2 up/down: 0/-15601 (-15601)

tested with many different config combinations. Hopefully didn't miss anything.

Signed-off-by: Alexei Starovoitov
Acked-by: Daniel Borkmann
Signed-off-by: David S. Miller

Alexei Starovoitov
2014-10-28 07:09:59 +0800

14 Oct, 2014

3 commits

63a12d9d0 kernel/param: consolidate __{start,stop}___param[] in <linux/moduleparam.h> ... Browse Code »

Consolidate the various external const and non-const declarations of
__start___param[] and __stop___param in . This
requires making a few struct kernel_param pointers in kernel/params.c
const.

Signed-off-by: Geert Uytterhoeven
Acked-by: Rusty Russell
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Geert Uytterhoeven
2014-10-14 08:18:28 +0800
c34d85aca init/initramfs.c: resolve shadow warnings ... Browse Code »

Resolve shadow warnings that are produced in W=2 builds by renaming a
global with a too-generic name and renaming a formal parameter.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Mark Rustad
Signed-off-by: Jeff Kirsher
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Mark Rustad
2014-10-14 08:18:22 +0800
2240a31db printk: don't bother using LOG_CPU_MAX_BUF_SHIFT on !SMP ... Browse Code »

When configuring a uniprocessor kernel, don't bother the user with an
irrelevant LOG_CPU_MAX_BUF_SHIFT question, and don't build the unused
code.

Signed-off-by: Geert Uytterhoeven
Acked-by: Luis R. Rodriguez
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Geert Uytterhoeven
2014-10-14 08:18:12 +0800

13 Oct, 2014

1 commit

faafcba3b Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip ... Browse Code »

Pull scheduler updates from Ingo Molnar:
"The main changes in this cycle were:

- Optimized support for Intel "Cluster-on-Die" (CoD) topologies (Dave
Hansen)

- Various sched/idle refinements for better idle handling (Nicolas
Pitre, Daniel Lezcano, Chuansheng Liu, Vincent Guittot)

- sched/numa updates and optimizations (Rik van Riel)

- sysbench speedup (Vincent Guittot)

- capacity calculation cleanups/refactoring (Vincent Guittot)

- Various cleanups to thread group iteration (Oleg Nesterov)

- Double-rq-lock removal optimization and various refactorings
(Kirill Tkhai)

- various sched/deadline fixes

... and lots of other changes"

* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (72 commits)
sched/dl: Use dl_bw_of() under rcu_read_lock_sched()
sched/fair: Delete resched_cpu() from idle_balance()
sched, time: Fix build error with 64 bit cputime_t on 32 bit systems
sched: Improve sysbench performance by fixing spurious active migration
sched/x86: Fix up typo in topology detection
x86, sched: Add new topology for multi-NUMA-node CPUs
sched/rt: Use resched_curr() in task_tick_rt()
sched: Use rq->rd in sched_setaffinity() under RCU read lock
sched: cleanup: Rename 'out_unlock' to 'out_free_new_mask'
sched: Use dl_bw_of() under RCU read lock
sched/fair: Remove duplicate code from can_migrate_task()
sched, mips, ia64: Remove __ARCH_WANT_UNLOCKED_CTXSW
sched: print_rq(): Don't use tasklist_lock
sched: normalize_rt_tasks(): Don't use _irqsave for tasklist_lock, use task_rq_lock()
sched: Fix the task-group check in tg_has_rt_tasks()
sched/fair: Leverage the idle state info when choosing the "idlest" cpu
sched: Let the scheduler see CPU idle states
sched/deadline: Fix inter- exclusive cpusets migrations
sched/deadline: Clear dl_entity params when setscheduling to different class
sched/numa: Kill the wrong/dead TASK_DEAD check in task_numa_fault()
...

Linus Torvalds
2014-10-13 22:23:15 +0800