Doug / smarc-fsl-linux-kernel | Embedian Git Server

13 Mar, 2013

1 commit

3d374d09f final removal of CONFIG_EXPERIMENTAL ... Browse Code »

Remove "config EXPERIMENTAL" itself, now that every "depends on" it has
been removed from the tree.

Signed-off-by: Kees Cook
Signed-off-by: Greg Kroah-Hartman

Kees Cook
2013-03-13 07:30:27 +0800

02 Mar, 2013

1 commit

e23b62256 Merge tag 'arc-v3.9-rc1-late' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc ... Browse Code »

Pull new ARC architecture from Vineet Gupta:
"Initial ARC Linux port with some fixes on top for 3.9-rc1:

I would like to introduce the Linux port to ARC Processors (from
Synopsys) for 3.9-rc1. The patch-set has been discussed on the public
lists since Nov and has received a fair bit of review, specially from
Arnd, tglx, Al and other subsystem maintainers for DeviceTree, kgdb...

The arch bits are in arch/arc, some asm-generic changes (acked by
Arnd), a minor change to PARISC (acked by Helge).

The series is a touch bigger for a new port for 2 main reasons:

1. It enables a basic kernel in first sub-series and adds
ptrace/kgdb/.. later

2. Some of the fallout of review (DeviceTree support, multi-platform-
image support) were added on top of orig series, primarily to
record the revision history.

This updated pull request additionally contains

- fixes due to our GNU tools catching up with the new syscall/ptrace
ABI

- some (minor) cross-arch Kconfig updates."

* tag 'arc-v3.9-rc1-late' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: (82 commits)
ARC: split elf.h into uapi and export it for userspace
ARC: Fixup the current ABI version
ARC: gdbserver using regset interface possibly broken
ARC: Kconfig cleanup tracking cross-arch Kconfig pruning in merge window
ARC: make a copy of flat DT
ARC: [plat-arcfpga] DT arc-uart bindings change: "baud" => "current-speed"
ARC: Ensure CONFIG_VIRT_TO_BUS is not enabled
ARC: Fix pt_orig_r8 access
ARC: [3.9] Fallout of hlist iterator update
ARC: 64bit RTSC timestamp hardware issue
ARC: Don't fiddle with non-existent caches
ARC: Add self to MAINTAINERS
ARC: Provide a default serial.h for uart drivers needing BASE_BAUD
ARC: [plat-arcfpga] defconfig for fully loaded ARC Linux
ARC: [Review] Multi-platform image #8: platform registers SMP callbacks
ARC: [Review] Multi-platform image #7: SMP common code to use callbacks
ARC: [Review] Multi-platform image #6: cpu-to-dma-addr optional
ARC: [Review] Multi-platform image #5: NR_IRQS defined by ARC core
ARC: [Review] Multi-platform image #4: Isolate platform headers
ARC: [Review] Multi-platform image #3: switch to board callback
...

Linus Torvalds
2013-03-02 23:58:56 +0800

26 Feb, 2013

2 commits

94f2f1423 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace ... Browse Code »

Pull user namespace and namespace infrastructure changes from Eric W Biederman:
"This set of changes starts with a few small enhnacements to the user
namespace. reboot support, allowing more arbitrary mappings, and
support for mounting devpts, ramfs, tmpfs, and mqueuefs as just the
user namespace root.

I do my best to document that if you care about limiting your
unprivileged users that when you have the user namespace support
enabled you will need to enable memory control groups.

There is a minor bug fix to prevent overflowing the stack if someone
creates way too many user namespaces.

The bulk of the changes are a continuation of the kuid/kgid push down
work through the filesystems. These changes make using uids and gids
typesafe which ensures that these filesystems are safe to use when
multiple user namespaces are in use. The filesystems converted for
3.9 are ceph, 9p, afs, ocfs2, gfs2, ncpfs, nfs, nfsd, and cifs. The
changes for these filesystems were a little more involved so I split
the changes into smaller hopefully obviously correct changes.

XFS is the only filesystem that remains. I was hoping I could get
that in this release so that user namespace support would be enabled
with an allyesconfig or an allmodconfig but it looks like the xfs
changes need another couple of days before it they are ready."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (93 commits)
cifs: Enable building with user namespaces enabled.
cifs: Convert struct cifs_ses to use a kuid_t and a kgid_t
cifs: Convert struct cifs_sb_info to use kuids and kgids
cifs: Modify struct smb_vol to use kuids and kgids
cifs: Convert struct cifsFileInfo to use a kuid
cifs: Convert struct cifs_fattr to use kuid and kgids
cifs: Convert struct tcon_link to use a kuid.
cifs: Modify struct cifs_unix_set_info_args to hold a kuid_t and a kgid_t
cifs: Convert from a kuid before printing current_fsuid
cifs: Use kuids and kgids SID to uid/gid mapping
cifs: Pass GLOBAL_ROOT_UID and GLOBAL_ROOT_GID to keyring_alloc
cifs: Use BUILD_BUG_ON to validate uids and gids are the same size
cifs: Override unmappable incoming uids and gids
nfsd: Enable building with user namespaces enabled.
nfsd: Properly compare and initialize kuids and kgids
nfsd: Store ex_anon_uid and ex_anon_gid as kuids and kgids
nfsd: Modify nfsd4_cb_sec to use kuids and kgids
nfsd: Handle kuids and kgids in the nfs4acl to posix_acl conversion
nfsd: Convert nfsxdr to use kuids and kgids
nfsd: Convert nfs3xdr to use kuids and kgids
...

Linus Torvalds
2013-02-26 08:00:49 +0800
9043a2650 Merge tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux ... Browse Code »

Pull module update from Rusty Russell:
"The sweeping change is to make add_taint() explicitly indicate whether
to disable lockdep, but it's a mechanical change."

* tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
MODSIGN: Add option to not sign modules during modules_install
MODSIGN: Add -s option to sign-file
MODSIGN: Specify the hash algorithm on sign-file command line
MODSIGN: Simplify Makefile with a Kconfig helper
module: clean up load_module a little more.
modpost: Ignore ARC specific non-alloc sections
module: constify within_module_*
taint: add explicit flag to show whether lock dep is still OK.
module: printk message when module signature fail taints kernel.

Linus Torvalds
2013-02-26 07:41:43 +0800

22 Feb, 2013

2 commits

27ea6dfdc Merge tag 'please-pull-misc-3.9' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux ... Browse Code »

Pull misc ia64 bits from Tony Luck.

* tag 'please-pull-misc-3.9' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux:
MAINTAINERS: update SGI & ia64 Altix stuff
sysctl: Enable IA64 "ignore-unaligned-usertrap" to be used cross-arch

Linus Torvalds
2013-02-22 09:55:48 +0800
06991c28f Merge tag 'driver-core-3.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core ... Browse Code »

Pull driver core patches from Greg Kroah-Hartman:
"Here is the big driver core merge for 3.9-rc1

There are two major series here, both of which touch lots of drivers
all over the kernel, and will cause you some merge conflicts:

- add a new function called devm_ioremap_resource() to properly be
able to check return values.

- remove CONFIG_EXPERIMENTAL

Other than those patches, there's not much here, some minor fixes and
updates"

Fix up trivial conflicts

* tag 'driver-core-3.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (221 commits)
base: memory: fix soft/hard_offline_page permissions
drivercore: Fix ordering between deferred_probe and exiting initcalls
backlight: fix class_find_device() arguments
TTY: mark tty_get_device call with the proper const values
driver-core: constify data for class_find_device()
firmware: Ignore abort check when no user-helper is used
firmware: Reduce ifdef CONFIG_FW_LOADER_USER_HELPER
firmware: Make user-mode helper optional
firmware: Refactoring for splitting user-mode helper code
Driver core: treat unregistered bus_types as having no devices
watchdog: Convert to devm_ioremap_resource()
thermal: Convert to devm_ioremap_resource()
spi: Convert to devm_ioremap_resource()
power: Convert to devm_ioremap_resource()
mtd: Convert to devm_ioremap_resource()
mmc: Convert to devm_ioremap_resource()
mfd: Convert to devm_ioremap_resource()
media: Convert to devm_ioremap_resource()
iommu: Convert to devm_ioremap_resource()
drm: Convert to devm_ioremap_resource()
...

Linus Torvalds
2013-02-22 04:05:51 +0800

20 Feb, 2013

2 commits

d652e1eb8 Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip ... Browse Code »

Pull scheduler changes from Ingo Molnar:
"Main changes:

- scheduler side full-dynticks (user-space execution is undisturbed
and receives no timer IRQs) preparation changes that convert the
cputime accounting code to be full-dynticks ready, from Frederic
Weisbecker.

- Initial sched.h split-up changes, by Clark Williams

- select_idle_sibling() performance improvement by Mike Galbraith:

" 1 tbench pair (worst case) in a 10 core + SMT package:

pre 15.22 MB/sec 1 procs
post 252.01 MB/sec 1 procs "

- sched_rr_get_interval() ABI fix/change. We think this detail is not
used by apps (so it's not an ABI in practice), but lets keep it
under observation.

- misc RT scheduling cleanups, optimizations"

* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
sched/rt: Add header to
cputime: Remove irqsave from seqlock readers
sched, powerpc: Fix sched.h split-up build failure
cputime: Restore CPU_ACCOUNTING config defaults for PPC64
sched/rt: Move rt specific bits into new header file
sched/rt: Add a tuning knob to allow changing SCHED_RR timeslice
sched: Move sched.h sysctl bits into separate header
sched: Fix signedness bug in yield_to()
sched: Fix select_idle_sibling() bouncing cow syndrome
sched/rt: Further simplify pick_rt_task()
sched/rt: Do not account zero delta_exec in update_curr_rt()
cputime: Safely read cputime of full dynticks CPUs
kvm: Prepare to add generic guest entry/exit callbacks
cputime: Use accessors to read task cputime stats
cputime: Allow dynamic switch between tick/virtual based cputime accounting
cputime: Generic on-demand virtual cputime accounting
cputime: Move default nsecs_to_cputime() to jiffies based cputime file
cputime: Librarize per nsecs resolution cputime definitions
cputime: Avoid multiplication overflow on utime scaling
context_tracking: Export context state for generic vtime
...

Fix up conflict in kernel/context_tracking.c due to comment additions.

Linus Torvalds
2013-02-20 10:19:48 +0800
b7133a9a1 Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip ... Browse Code »

Pull irq core changes from Ingo Molnar:
"The biggest changes are the IRQ-work and printk changes from Frederic
Weisbecker, which prepare the code for 'full dynticks' (the ability to
stop or slow down the periodic tick arbitrarily, not just in idle time
as today):

- Don't stop tick with irq works pending. This fix is generally
useful and concerns archs that can't raise self IPIs.

- Flush irq works before CPU offlining.

- Introduce "lazy" irq works that can wait for the next tick to be
executed, unless it's stopped.

- Implement klogd wake up using irq work. This removes the ad-hoc
printk_tick()/printk_needs_cpu() hooks and make it working even in
dynticks mode.

- Cleanups and fixes."

* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genirq: Export enable/disable_percpu_irq()
arch Kconfig: Remove references to IRQ_PER_CPU
irq_work: Remove return value from the irq_work_queue() function
genirq: Avoid deadlock in spurious handling
printk: Wake up klogd using irq_work
irq_work: Make self-IPIs optable
irq_work: Warn if there's still work on cpu_down
irq_work: Flush work on CPU_DYING
irq_work: Don't stop the tick with pending works
nohz: Add API to check tick state
irq_work: Remove CONFIG_HAVE_IRQ_WORK
irq_work: Fix racy check on work pending flag
irq_work: Fix racy IRQ_WORK_BUSY flag setting

Linus Torvalds
2013-02-20 09:47:58 +0800

16 Feb, 2013

1 commit

bf14e3b97 sysctl: Enable PARISC "unaligned-trap" to be used cross-arch ... Browse Code »

PARISC defines /proc/sys/kernel/unaligned-trap to runtime toggle
unaligned access emulation.

The exact mechanics of enablig/disabling are still arch specific, we can
make the sysctl usable by other arches.

Signed-off-by: Vineet Gupta
Acked-by: Helge Deller
Cc: "James E.J. Bottomley"
Cc: Helge Deller
Cc: "Eric W. Biederman"
Cc: Serge Hallyn

Vineet Gupta
2013-02-16 01:46:05 +0800

13 Feb, 2013

8 commits

139321c65 cifs: Enable building with user namespaces enabled. ... Browse Code »

Cc: Steve French
Signed-off-by: "Eric W. Biederman"

Eric W. Biederman
2013-02-13 23:28:56 +0800
c9617a44b nfsd: Enable building with user namespaces enabled. ... Browse Code »

Now that the kuids and kgids conversion have propogated
through net/sunrpc/ and the fs/nfsd/ it is safe to enable
building nfsd when user namespaces are enabled.

Cc: "J. Bruce Fields"
Cc: Trond Myklebust
Signed-off-by: "Eric W. Biederman"

Eric W. Biederman
2013-02-13 22:16:10 +0800
4277bbf75 nfs: Enable building with user namespaces enabled. ... Browse Code »

Now that the kuids and kgids conversion have propogated
through net/sunrpc/ and the fs/nfs/ it is safe to enable
building nfs when user namespaces are enabled.

Cc: "J. Bruce Fields"
Cc: Trond Myklebust
Signed-off-by: "Eric W. Biederman"

Eric W. Biederman
2013-02-13 22:15:34 +0800
1ac7fd819 ncpfs: Support interacting with multiple user namespaces ... Browse Code »

ncpfs does not natively support uids and gids so this conversion was
simply a matter of updating the the type of the mounteduid, the uid
and the gid on the superblock. Fixing the ioctls that read them,
updating the mount option parser and the mount option printer.

Cc: Petr Vandrovec
Acked-by: Serge Hallyn
Signed-off-by: Eric W. Biederman

Eric W. Biederman
2013-02-13 22:15:13 +0800
0f07bd375 gfs2: Enable building with user namespaces enabled ... Browse Code »

Now that all of the necessary work has been done to push kuids and
kgids throughout gfs2 and to convert between kuids and kgids when
reading and writing the on disk structures it is safe to enable gfs2
when multiple user namespaces are enabled.

Cc: Steven Whitehouse
Signed-off-by: "Eric W. Biederman"

Eric W. Biederman
2013-02-13 22:15:12 +0800
ecb528e3e ocfs2: Enable building with user namespaces enabled ... Browse Code »

Now that ocfs2 has been converted to store uids and gids in
kuid_t and kgid_t and all of the conversions have been added
to the appropriate places it is safe to allow building and
using ocfs2 with user namespace support enabled.

Cc: Mark Fasheh
Cc: Joel Becker
Signed-off-by: "Eric W. Biederman"

Eric W. Biederman
2013-02-13 22:14:32 +0800
515ee7bd9 coda: Allow coda to be built when user namespace support is enabled ... Browse Code »

Now that the coda kernel to userspace has been modified to convert
between kuids and kgids and uids and gids, and all internal
coda structures have be modified to store uids and gids as
kuids and kgids it is safe to allow code to be built with
user namespace support enabled.

Cc: Jan Harkes
Signed-off-by: "Eric W. Biederman"

Eric W. Biederman
2013-02-13 22:00:55 +0800
a0a5386ac afs: Support interacting with multiple user namespaces ... Browse Code »

Modify struct afs_file_status to store owner as a kuid_t and group as
a kgid_t.

In xdr_decode_AFSFetchStatus as owner is now a kuid_t and group is now
a kgid_t don't use the EXTRACT macro. Instead perform the work of
the extract macro explicitly. Read the value with ntohl and
convert it to the appropriate type with make_kuid or make_kgid.
Test if the value is different from what is stored in status and
update changed. Update the value in status.

In xdr_encode_AFS_StoreStatus call from_kuid or from_kgid as
we are computing the on the wire encoding.

Initialize uids with GLOBAL_ROOT_UID instead of 0.
Initialize gids with GLOBAL_ROOT_GID instead of 0.

Cc: David Howells
Acked-by: Serge Hallyn
Signed-off-by: Eric W. Biederman

Eric W. Biederman
2013-02-13 22:00:51 +0800

12 Feb, 2013

2 commits

4fa814be2 9p: Allow building 9p with user namespaces enabled. ... Browse Code »

Now that the uid_t -> kuid_t, gid_t -> kgid_t conversion
has been completed in 9p allow 9p to be built when user
namespaces are enabled.

Cc: Eric Van Hensbergen
Cc: Ron Minnich
Cc: Latchesar Ionkov
Signed-off-by: "Eric W. Biederman"

Eric W. Biederman
2013-02-12 19:19:35 +0800
d5ea055f1 ceph: Enable building when user namespaces are enabled. ... Browse Code »

Now that conversions happen from kuids and kgids when generating ceph
messages and conversion happen to kuids and kgids after receiving
celph messages, and all intermediate data structures store uids and
gids as type kuid_t and kgid_t it is safe to enable ceph with
user namespace support enabled.

Cc: Sage Weil
Signed-off-by: "Eric W. Biederman"

Eric W. Biederman
2013-02-12 19:19:28 +0800

08 Feb, 2013

1 commit

02fc8d372 cputime: Restore CPU_ACCOUNTING config defaults for PPC64 ... Browse Code »

Commit abf917cd91cb ("cputime: Generic on-demand virtual cputime
accounting") inadvertantly changed the default CPU_ACCOUNTING
config for PPC64. Repair that.

Signed-off-by: Stephen Rothwell
Acked-by: Frederic Weisbecker
Cc: Li Zhong
Cc: Namhyung Kim
Cc: Paul E. McKenney
Cc: Paul Gortmaker
Cc: Peter Zijlstra
Cc: Steven Rostedt
Cc: ppc-dev
Cc: Benjamin Herrenschmidt
Link: http://lkml.kernel.org/r/20130208141938.f31b7b9e1acac5bbe769ee4c@canb.auug.org.au
Signed-off-by: Ingo Molnar

Stephen Rothwell
2013-02-08 22:23:12 +0800

05 Feb, 2013

2 commits

077931446 Merge branch 'nohz/printk-v8' into irq/core ... Browse Code »

Conflicts:
kernel/irq_work.c

Add support for printk in full dynticks CPU.

* Don't stop tick with irq works pending. This
fix is generally useful and concerns archs that
can't raise self IPIs.

* Flush irq works before CPU offlining.

* Introduce "lazy" irq works that can wait for the
next tick to be executed, unless it's stopped.

* Implement klogd wake up using irq work. This
removes the ad-hoc printk_tick()/printk_needs_cpu()
hooks and make it working even in dynticks mode.

Signed-off-by: Frederic Weisbecker

Frederic Weisbecker
2013-02-05 07:48:46 +0800
9228b5f24 Merge branch 'rcu/next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck… ... Browse Code »

…/linux-rcu into core/rcu

Pull RCU updates from Paul E. McKenney:

1. Changes to rcutorture and to RCU documentation. Posted to LKML at
https://lkml.org/lkml/2013/1/26/188.

2. Enhancements to uniprocessor handling in tiny RCU. Posted to LKML
at https://lkml.org/lkml/2013/1/27/2.

3. Tag RCU callbacks with grace-period number to simplify callback
advancement. Posted to LKML at https://lkml.org/lkml/2013/1/26/203.

4. Miscellaneous fixes. Posted to LKML at https://lkml.org/lkml/2013/1/26/204.

Signed-off-by: Ingo Molnar <mingo@kernel.org>

Ingo Molnar
2013-02-05 02:06:34 +0800

29 Jan, 2013

2 commits

9fc52d832 rcu: Allow TREE_PREEMPT_RCU on UP systems ... Browse Code »

The TINY_PREEMPT_RCU is complex, does not provide that much memory
savings, and therefore TREE_PREEMPT_RCU should be used instead. The
systems where the difference between TINY_PREEMPT_RCU and TREE_PREEMPT_RCU
are quite small compared to the memory footprint of CONFIG_PREEMPT.

This commit therefore takes a first step towards eliminating
TINY_PREEMPT_RCU by allowing TREE_PREEMPT_RCU to be configured on !SMP
systems.

Signed-off-by: Paul E. McKenney
Reviewed-by: Josh Triplett

Paul E. McKenney
2013-01-29 14:06:22 +0800
6bfc09e23 rcu: Provide RCU CPU stall warnings for tiny RCU ... Browse Code »

Tiny RCU has historically omitted RCU CPU stall warnings in order to
reduce memory requirements, however, lack of these warnings caused
Thomas Gleixner some debugging pain recently. Therefore, this commit
adds RCU CPU stall warnings to tiny RCU if RCU_TRACE=y. This keeps
the memory footprint small, while still enabling CPU stall warnings
in kernels built to enable them.

Updated to include Josh Triplett's suggested use of RCU_STALL_COMMON
config variable to simplify #if expressions.

Reported-by: Thomas Gleixner
Signed-off-by: Paul E. McKenney
Signed-off-by: Paul E. McKenney
Reviewed-by: Josh Triplett

Paul E. McKenney
2013-01-29 14:06:21 +0800

28 Jan, 2013

1 commit

abf917cd9 cputime: Generic on-demand virtual cputime accounting ... Browse Code »

If we want to stop the tick further idle, we need to be
able to account the cputime without using the tick.

Virtual based cputime accounting solves that problem by
hooking into kernel/user boundaries.

However implementing CONFIG_VIRT_CPU_ACCOUNTING require
low level hooks and involves more overhead. But we already
have a generic context tracking subsystem that is required
for RCU needs by archs which plan to shut down the tick
outside idle.

This patch implements a generic virtual based cputime
accounting that relies on these generic kernel/user hooks.

There are some upsides of doing this:

- This requires no arch code to implement CONFIG_VIRT_CPU_ACCOUNTING
if context tracking is already built (already necessary for RCU in full
tickless mode).

- We can rely on the generic context tracking subsystem to dynamically
(de)activate the hooks, so that we can switch anytime between virtual
and tick based accounting. This way we don't have the overhead
of the virtual accounting when the tick is running periodically.

And one downside:

- There is probably more overhead than a native virtual based cputime
accounting. But this relies on hooks that are already set anyway.

Signed-off-by: Frederic Weisbecker
Cc: Andrew Morton
Cc: Ingo Molnar
Cc: Li Zhong
Cc: Namhyung Kim
Cc: Paul E. McKenney
Cc: Paul Gortmaker
Cc: Peter Zijlstra
Cc: Steven Rostedt
Cc: Thomas Gleixner

Frederic Weisbecker
2013-01-28 02:23:27 +0800

27 Jan, 2013

1 commit

e11f0ae38 userns: Recommend use of memory control groups. ... Browse Code »

In the help text describing user namespaces recommend use of memory
control groups. In many cases memory control groups are the only
mechanism there is to limit how much memory a user who can create
user namespaces can use.

Acked-by: Serge Hallyn
Signed-off-by: Eric W. Biederman

Eric W. Biederman
2013-01-27 14:20:06 +0800

25 Jan, 2013

2 commits

d9d8d7ed4 MODSIGN: Add option to not sign modules during modules_install ... Browse Code »

To allow the builder to sign only a subset of modules, or to sign the
modules using a key that is not available on the build machine, add
CONFIG_MODULE_SIG_ALL. If this option is unset, no modules will be
signed during build. The default is 'y', to preserve the current
behavior.

Signed-off-by: Michal Marek
Acked-by: David Howells
Signed-off-by: Rusty Russell

Michal Marek
2013-01-25 14:25:37 +0800
227536740 MODSIGN: Simplify Makefile with a Kconfig helper ... Browse Code »

Signed-off-by: Michal Marek
Acked-by: David Howells
Signed-off-by: Rusty Russell

Michal Marek
2013-01-25 14:25:35 +0800

24 Jan, 2013

1 commit

786133f6e Merge branch 'core/irq_work' of git://git.kernel.org/pub/scm/linux/kernel/git/fr… ... Browse Code »

…ederic/linux-dynticks into irq/core

irq_work fixes and cleanups, in preparation for full dyntics support.

Signed-off-by: Ingo Molnar <mingo@kernel.org>

Ingo Molnar
2013-01-24 19:48:41 +0800

18 Jan, 2013

1 commit

ed408f7c0 Merge 3.9-rc4 into driver-core-next ... Browse Code »

This is to fix up a build problem with a wireless driver due to the
dynamic-debug patches in this branch.

Signed-off-by: Greg Kroah-Hartman

Greg Kroah-Hartman
2013-01-18 11:48:18 +0800

17 Jan, 2013

1 commit

3a55fb0d9 Tell the world we gave up on pushing CC_OPTIMIZE_FOR_SIZE ... Browse Code »

In commit 281dc5c5ec0f ("Give up on pushing CC_OPTIMIZE_FOR_SIZE") we
already changed the actual default value, but the help-text still
suggested 'y'. Fix the help text too, for all the same reasons.

Sadly, -Os keeps on generating some very suboptimal code for certain
cases, to the point where any I$ miss upside is swamped by the downside.
The main ones are:

- using "rep movsb" for memcpy, even on CPU's where that is
horrendously bad for performance.

- not honoring branch prediction information, so any I$ footprint you
win from smaller code, you lose from less code density in the I$.

- using divide instructions when that is very expensive.

Signed-off-by: Kirill Smelkov
Signed-off-by: Linus Torvalds

Kirill Smelkov
2013-01-17 04:42:57 +0800

12 Jan, 2013

2 commits

19c923998 init: remove depends on CONFIG_EXPERIMENTAL ... Browse Code »

The CONFIG_EXPERIMENTAL config item has not carried much meaning for a
while now and is almost always enabled by default. As agreed during the
Linux kernel summit, remove it from any "depends on" lines in Kconfigs.

CC: "Eric W. Biederman"
CC: Serge Hallyn
CC: "Paul E. McKenney"
CC: Andrew Morton
CC: Frederic Weisbecker
Signed-off-by: Kees Cook
Acked-by: Serge Hallyn

Kees Cook
2013-01-12 03:39:33 +0800
5a958db31 make CONFIG_EXPERIMENTAL invisible and default ... Browse Code »

This config item has not carried much meaning for a while now and is
almost always enabled by default (especially in distro builds). As agreed
during the Linux kernel summit, it should be removed. As a first step,
remove it from being listed, and default it to on. Once it has been
removed from all subsystem Kconfigs, it will be dropped entirely.

For items that really are experimental, maintainers should use "default
n", optionally include "(EXPERIMENTAL)" in the title, and add language to
the help text indicating why the item should be considered experimental.

For items that are dangerously experimental, the maintainer is encouraged
to follow the above title recommendation, add stronger language to the
help text, and optionally use (depending on the extent of the danger,
from least to most dangerous): printk(), add_taint(TAINT_WARN),
add_taint(TAINT_CRAP), WARN_ON(1), and CONFIG_BROKEN.

CC: Greg KH
CC: "Eric W. Biederman"
CC: Serge Hallyn
CC: "Paul E. McKenney"
CC: Andrew Morton
CC: Frederic Weisbecker
Signed-off-by: Kees Cook
Acked-by: Serge Hallyn
Acked-by: Greg Kroah-Hartman
Reviewed-by: Paul E. McKenney

Kees Cook
2013-01-12 03:38:03 +0800

10 Jan, 2013

1 commit

b6fca7253 sysctl: Enable IA64 "ignore-unaligned-usertrap" to be used cross-arch ... Browse Code »

IA64 defines /proc/sys/kernel/ignore-unaligned-usertrap to control
verbose warnings on unaligned access emulation.

Although the exact mechanics of what to do with sysctl (ignore/shout)
are arch specific, this change enables the sysctl to be usable cross-arch.

Signed-off-by: Vineet Gupta
Cc: Fenghua Yu
Cc: "Eric W. Biederman"
Cc: Serge Hallyn
Signed-off-by: Tony Luck

Vineet Gupta
2013-01-10 02:48:34 +0800

19 Dec, 2012

2 commits

d7f25f8a2 memcg: infrastructure to match an allocation to the right cache ... Browse Code »

The page allocator is able to bind a page to a memcg when it is
allocated. But for the caches, we'd like to have as many objects as
possible in a page belonging to the same cache.

This is done in this patch by calling memcg_kmem_get_cache in the
beginning of every allocation function. This function is patched out by
static branches when kernel memory controller is not being used.

It assumes that the task allocating, which determines the memcg in the
page allocator, belongs to the same cgroup throughout the whole process.
Misaccounting can happen if the task calls memcg_kmem_get_cache() while
belonging to a cgroup, and later on changes. This is considered
acceptable, and should only happen upon task migration.

Before the cache is created by the memcg core, there is also a possible
imbalance: the task belongs to a memcg, but the cache being allocated from
is the global cache, since the child cache is not yet guaranteed to be
ready. This case is also fine, since in this case the GFP_KMEMCG will not
be passed and the page allocator will not attempt any cgroup accounting.

Signed-off-by: Glauber Costa
Cc: Christoph Lameter
Cc: David Rientjes
Cc: Frederic Weisbecker
Cc: Greg Thelen
Cc: Johannes Weiner
Cc: JoonSoo Kim
Cc: KAMEZAWA Hiroyuki
Cc: Mel Gorman
Cc: Michal Hocko
Cc: Pekka Enberg
Cc: Rik van Riel
Cc: Suleiman Souhlal
Cc: Tejun Heo
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Glauber Costa
2012-12-19 07:02:14 +0800
510fc4e11 memcg: kmem accounting basic infrastructure ... Browse Code »

Add the basic infrastructure for the accounting of kernel memory. To
control that, the following files are created:

* memory.kmem.usage_in_bytes
* memory.kmem.limit_in_bytes
* memory.kmem.failcnt
* memory.kmem.max_usage_in_bytes

They have the same meaning of their user memory counterparts. They
reflect the state of the "kmem" res_counter.

Per cgroup kmem memory accounting is not enabled until a limit is set for
the group. Once the limit is set the accounting cannot be disabled for
that group. This means that after the patch is applied, no behavioral
changes exists for whoever is still using memcg to control their memory
usage, until memory.kmem.limit_in_bytes is set for the first time.

We always account to both user and kernel resource_counters. This
effectively means that an independent kernel limit is in place when the
limit is set to a lower value than the user memory. A equal or higher
value means that the user limit will always hit first, meaning that kmem
is effectively unlimited.

People who want to track kernel memory but not limit it, can set this
limit to a very high number (like RESOURCE_MAX - 1page - that no one will
ever hit, or equal to the user memory)

[akpm@linux-foundation.org: MEMCG_MMEM only works with slab and slub]
Signed-off-by: Glauber Costa
Acked-by: Kamezawa Hiroyuki
Acked-by: Michal Hocko
Cc: Johannes Weiner
Cc: Tejun Heo
Cc: Christoph Lameter
Cc: David Rientjes
Cc: Frederic Weisbecker
Cc: Greg Thelen
Cc: JoonSoo Kim
Cc: Mel Gorman
Cc: Pekka Enberg
Cc: Rik van Riel
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Glauber Costa
2012-12-19 07:02:12 +0800

18 Dec, 2012

1 commit

6a2b60b17 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace ... Browse Code »

Pull user namespace changes from Eric Biederman:
"While small this set of changes is very significant with respect to
containers in general and user namespaces in particular. The user
space interface is now complete.

This set of changes adds support for unprivileged users to create user
namespaces and as a user namespace root to create other namespaces.
The tyranny of supporting suid root preventing unprivileged users from
using cool new kernel features is broken.

This set of changes completes the work on setns, adding support for
the pid, user, mount namespaces.

This set of changes includes a bunch of basic pid namespace
cleanups/simplifications. Of particular significance is the rework of
the pid namespace cleanup so it no longer requires sending out
tendrils into all kinds of unexpected cleanup paths for operation. At
least one case of broken error handling is fixed by this cleanup.

The files under /proc//ns/ have been converted from regular files
to magic symlinks which prevents incorrect caching by the VFS,
ensuring the files always refer to the namespace the process is
currently using and ensuring that the ptrace_mayaccess permission
checks are always applied.

The files under /proc//ns/ have been given stable inode numbers
so it is now possible to see if different processes share the same
namespaces.

Through the David Miller's net tree are changes to relax many of the
permission checks in the networking stack to allowing the user
namespace root to usefully use the networking stack. Similar changes
for the mount namespace and the pid namespace are coming through my
tree.

Two small changes to add user namespace support were commited here adn
in David Miller's -net tree so that I could complete the work on the
/proc//ns/ files in this tree.

Work remains to make it safe to build user namespaces and 9p, afs,
ceph, cifs, coda, gfs2, ncpfs, nfs, nfsd, ocfs2, and xfs so the
Kconfig guard remains in place preventing that user namespaces from
being built when any of those filesystems are enabled.

Future design work remains to allow root users outside of the initial
user namespace to mount more than just /proc and /sys."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (38 commits)
proc: Usable inode numbers for the namespace file descriptors.
proc: Fix the namespace inode permission checks.
proc: Generalize proc inode allocation
userns: Allow unprivilged mounts of proc and sysfs
userns: For /proc/self/{uid,gid}_map derive the lower userns from the struct file
procfs: Print task uids and gids in the userns that opened the proc file
userns: Implement unshare of the user namespace
userns: Implent proc namespace operations
userns: Kill task_user_ns
userns: Make create_new_namespaces take a user_ns parameter
userns: Allow unprivileged use of setns.
userns: Allow unprivileged users to create new namespaces
userns: Allow setting a userns mapping to your current uid.
userns: Allow chown and setgid preservation
userns: Allow unprivileged users to create user namespaces.
userns: Ignore suid and sgid on binaries if the uid or gid can not be mapped
userns: fix return value on mntns_install() failure
vfs: Allow unprivileged manipulation of the mount namespace.
vfs: Only support slave subtrees across different user namespaces
vfs: Add a user namespace reference from struct mnt_namespace
...

Linus Torvalds
2012-12-18 07:44:47 +0800

17 Dec, 2012

1 commit

3d59eebc5 Merge tag 'balancenuma-v11' of git://git.kernel.org/pub/scm/linux/kernel/git/mel/linux-balancenuma ... Browse Code »

Pull Automatic NUMA Balancing bare-bones from Mel Gorman:
"There are three implementations for NUMA balancing, this tree
(balancenuma), numacore which has been developed in tip/master and
autonuma which is in aa.git.

In almost all respects balancenuma is the dumbest of the three because
its main impact is on the VM side with no attempt to be smart about
scheduling. In the interest of getting the ball rolling, it would be
desirable to see this much merged for 3.8 with the view to building
scheduler smarts on top and adapting the VM where required for 3.9.

The most recent set of comparisons available from different people are

mel: https://lkml.org/lkml/2012/12/9/108
mingo: https://lkml.org/lkml/2012/12/7/331
tglx: https://lkml.org/lkml/2012/12/10/437
srikar: https://lkml.org/lkml/2012/12/10/397

The results are a mixed bag. In my own tests, balancenuma does
reasonably well. It's dumb as rocks and does not regress against
mainline. On the other hand, Ingo's tests shows that balancenuma is
incapable of converging for this workloads driven by perf which is bad
but is potentially explained by the lack of scheduler smarts. Thomas'
results show balancenuma improves on mainline but falls far short of
numacore or autonuma. Srikar's results indicate we all suffer on a
large machine with imbalanced node sizes.

My own testing showed that recent numacore results have improved
dramatically, particularly in the last week but not universally.
We've butted heads heavily on system CPU usage and high levels of
migration even when it shows that overall performance is better.
There are also cases where it regresses. Of interest is that for
specjbb in some configurations it will regress for lower numbers of
warehouses and show gains for higher numbers which is not reported by
the tool by default and sometimes missed in treports. Recently I
reported for numacore that the JVM was crashing with
NullPointerExceptions but currently it's unclear what the source of
this problem is. Initially I thought it was in how numacore batch
handles PTEs but I'm no longer think this is the case. It's possible
numacore is just able to trigger it due to higher rates of migration.

These reports were quite late in the cycle so I/we would like to start
with this tree as it contains much of the code we can agree on and has
not changed significantly over the last 2-3 weeks."

* tag 'balancenuma-v11' of git://git.kernel.org/pub/scm/linux/kernel/git/mel/linux-balancenuma: (50 commits)
mm/rmap, migration: Make rmap_walk_anon() and try_to_unmap_anon() more scalable
mm/rmap: Convert the struct anon_vma::mutex to an rwsem
mm: migrate: Account a transhuge page properly when rate limiting
mm: numa: Account for failed allocations and isolations as migration failures
mm: numa: Add THP migration for the NUMA working set scanning fault case build fix
mm: numa: Add THP migration for the NUMA working set scanning fault case.
mm: sched: numa: Delay PTE scanning until a task is scheduled on a new node
mm: sched: numa: Control enabling and disabling of NUMA balancing if !SCHED_DEBUG
mm: sched: numa: Control enabling and disabling of NUMA balancing
mm: sched: Adapt the scanning rate if a NUMA hinting fault does not migrate
mm: numa: Use a two-stage filter to restrict pages being migrated for unlikely tasknode relationships
mm: numa: migrate: Set last_nid on newly allocated page
mm: numa: split_huge_page: Transfer last_nid on tail page
mm: numa: Introduce last_nid to the page frame
sched: numa: Slowly increase the scanning period as NUMA faults are handled
mm: numa: Rate limit setting of pte_numa if node is saturated
mm: numa: Rate limit the amount of memory that is migrated between nodes
mm: numa: Structures for Migrate On Fault per NUMA migration rate limiting
mm: numa: Migrate pages handled during a pmd_numa hinting fault
mm: numa: Migrate on reference policy
...

Linus Torvalds
2012-12-17 07:18:08 +0800

11 Dec, 2012

2 commits

1a687c2e9 mm: sched: numa: Control enabling and disabling of NUMA balancing ... Browse Code »

This patch adds Kconfig options and kernel parameters to allow the
enabling and disabling of automatic NUMA balancing. The existance
of such a switch was and is very important when debugging problems
related to transparent hugepages and we should have the same for
automatic NUMA placement.

Signed-off-by: Mel Gorman

Mel Gorman
2012-12-11 22:42:55 +0800
be3a72842 mm: numa: pte_numa() and pmd_numa() ... Browse Code »

Implement pte_numa and pmd_numa.

We must atomically set the numa bit and clear the present bit to
define a pte_numa or pmd_numa.

Once a pte or pmd has been set as pte_numa or pmd_numa, the next time
a thread touches a virtual address in the corresponding virtual range,
a NUMA hinting page fault will trigger. The NUMA hinting page fault
will clear the NUMA bit and set the present bit again to resolve the
page fault.

The expectation is that a NUMA hinting page fault is used as part
of a placement policy that decides if a page should remain on the
current node or migrated to a different node.

Acked-by: Rik van Riel
Signed-off-by: Andrea Arcangeli
Signed-off-by: Mel Gorman

Andrea Arcangeli
2012-12-11 22:42:36 +0800