Eric Lee / smarc-fsl-linux-kernel

19 Dec, 2013

1 commit

c97102ba9 kexec: migrate to reboot cpu ... Browse Code »

Commit 1b3a5d02ee07 ("reboot: move arch/x86 reboot= handling to generic
kernel") moved reboot= handling to generic code. In the process it also
removed the code in native_machine_shutdown() which are moving reboot
process to reboot_cpu/cpu0.

I guess that thought must have been that all reboot paths are calling
migrate_to_reboot_cpu(), so we don't need this special handling. But
kexec reboot path (kernel_kexec()) is not calling
migrate_to_reboot_cpu() so above change broke kexec. Now reboot can
happen on non-boot cpu and when INIT is sent in second kerneo to bring
up BP, it brings down the machine.

So start calling migrate_to_reboot_cpu() in kexec reboot path to avoid
this problem.

Bisected by WANG Chao.

Reported-by: Matthew Whitehead
Reported-by: Dave Young
Signed-off-by: Vivek Goyal
Tested-by: Baoquan He
Tested-by: WANG Chao
Acked-by: H. Peter Anvin
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Vivek Goyal
2013-12-19 11:04:50 +0800

18 Dec, 2013

1 commit

dd0508093 Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip ... Browse Code »

Pull scheduler fixes from Ingo Molnar:
"Three fixes for scheduler crashes, each triggers in relatively rare,
hardware environment dependent situations"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/fair: Rework sched_fair time accounting
math64: Add mul_u64_u32_shr()
sched: Remove PREEMPT_NEED_RESCHED from generic code
sched: Initialize power_orig for overlapping groups

Linus Torvalds
2013-12-18 04:35:54 +0800

16 Dec, 2013

1 commit

9199c4caa Merge tag 'pci-v3.13-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci ... Browse Code »

Pull PCI updates from Bjorn Helgaas:
"PCI device hotplug
- Move device_del() from pci_stop_dev() to pci_destroy_dev() (Rafael
Wysocki)

Host bridge drivers
- Update maintainers for DesignWare, i.MX6, Armada, R-Car (Bjorn
Helgaas)
- mvebu: Return 'unsupported' for Interrupt Line and Interrupt Pin
(Jason Gunthorpe)

Miscellaneous
- Avoid unnecessary CPU switch when calling .probe() (Alexander
Duyck)
- Revert "workqueue: allow work_on_cpu() to be called recursively"
(Bjorn Helgaas)
- Disable Bus Master only on kexec reboot (Khalid Aziz)
- Omit PCI ID macro strings to shorten quirk names for LTO (Michal
Marek)"

* tag 'pci-v3.13-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
MAINTAINERS: Add DesignWare, i.MX6, Armada, R-Car PCI host maintainers
PCI: Disable Bus Master only on kexec reboot
PCI: mvebu: Return 'unsupported' for Interrupt Line and Interrupt Pin
PCI: Omit PCI ID macro strings to shorten quirk names
PCI: Move device_del() from pci_stop_dev() to pci_destroy_dev()
Revert "workqueue: allow work_on_cpu() to be called recursively"
PCI: Avoid unnecessary CPU switch when calling driver .probe() method

Linus Torvalds
2013-12-16 03:45:27 +0800

13 Dec, 2013

3 commits

5dec682c7 Merge tag 'keys-devel-20131210' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs ... Browse Code »

Pull misc keyrings fixes from David Howells:
"These break down into five sets:

- A patch to error handling in the big_key type for huge payloads.
If the payload is larger than the "low limit" and the backing store
allocation fails, then big_key_instantiate() doesn't clear the
payload pointers in the key, assuming them to have been previously
cleared - but only one of them is.

Unfortunately, the garbage collector still calls big_key_destroy()
when sees one of the pointers with a weird value in it (and not
NULL) which it then tries to clean up.

- Three patches to fix the keyring type:

* A patch to fix the hash function to correctly divide keyrings off
from keys in the topology of the tree inside the associative
array. This is only a problem if searching through nested
keyrings - and only if the hash function incorrectly puts the a
keyring outside of the 0 branch of the root node.

* A patch to fix keyrings' use of the associative array. The
__key_link_begin() function initially passes a NULL key pointer
to assoc_array_insert() on the basis that it's holding a place in
the tree whilst it does more allocation and stuff.

This is only a problem when a node contains 16 keys that match at
that level and we want to add an also matching 17th. This should
easily be manufactured with a keyring full of keyrings (without
chucking any other sort of key into the mix) - except for (a)
above which makes it on average adding the 65th keyring.

* A patch to fix searching down through nested keyrings, where any
keyring in the set has more than 16 keyrings and none of the
first keyrings we look through has a match (before the tree
iteration needs to step to a more distal node).

Test in keyutils test suite:

http://git.kernel.org/cgit/linux/kernel/git/dhowells/keyutils.git/commit/?id=8b4ae963ed92523aea18dfbb8cab3f4979e13bd1

- A patch to fix the big_key type's use of a shmem file as its
backing store causing audit messages and LSM check failures. This
is done by setting S_PRIVATE on the file to avoid LSM checks on the
file (access to the shmem file goes through the keyctl() interface
and so is gated by the LSM that way).

This isn't normally a problem if a key is used by the context that
generated it - and it's currently only used by libkrb5.

Test in keyutils test suite:

http://git.kernel.org/cgit/linux/kernel/git/dhowells/keyutils.git/commit/?id=d9a53cbab42c293962f2f78f7190253fc73bd32e

- A patch to add a generated file to .gitignore.

- A patch to fix the alignment of the system certificate data such
that it it works on s390. As I understand it, on the S390 arch,
symbols must be 2-byte aligned because loading the address discards
the least-significant bit"

* tag 'keys-devel-20131210' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
KEYS: correct alignment of system_certificate_list content in assembly file
Ignore generated file kernel/x509_certificate_list
security: shmem: implement kernel private shmem inodes
KEYS: Fix searching of nested keyrings
KEYS: Fix multiple key add into associative array
KEYS: Fix the keyring hash function
KEYS: Pre-clear struct key on allocation

Linus Torvalds
2013-12-13 02:15:24 +0800
5cdec2d83 futex: move user address verification up to common code ... Browse Code »

When debugging the read-only hugepage case, I was confused by the fact
that get_futex_key() did an access_ok() only for the non-shared futex
case, since the user address checking really isn't in any way specific
to the private key handling.

Now, it turns out that the shared key handling does effectively do the
equivalent checks inside get_user_pages_fast() (it doesn't actually
check the address range on x86, but does check the page protections for
being a user page). So it wasn't actually a bug, but the fact that we
treat the address differently for private and shared futexes threw me
for a loop.

Just move the check up, so that it gets done for both cases. Also, use
the 'rw' parameter for the type, even if it doesn't actually matter any
more (it's a historical artifact of the old racy i386 "page faults from
kernel space don't check write protections").

Cc: Thomas Gleixner
Signed-off-by: Linus Torvalds

Linus Torvalds
2013-12-13 01:53:51 +0800
f12d5bfce futex: fix handling of read-only-mapped hugepages ... Browse Code »

The hugepage code had the exact same bug that regular pages had in
commit 7485d0d3758e ("futexes: Remove rw parameter from
get_futex_key()").

The regular page case was fixed by commit 9ea71503a8ed ("futex: Fix
regression with read only mappings"), but the transparent hugepage case
(added in a5b338f2b0b1: "thp: update futex compound knowledge") case
remained broken.

Found by Dave Jones and his trinity tool.

Reported-and-tested-by: Dave Jones
Cc: stable@kernel.org # v2.6.38+
Acked-by: Thomas Gleixner
Cc: Mel Gorman
Cc: Darren Hart
Cc: Andrea Arcangeli
Cc: Oleg Nesterov
Signed-off-by: Linus Torvalds

Linus Torvalds
2013-12-13 01:38:42 +0800

11 Dec, 2013

4 commits

9dbdb1555 sched/fair: Rework sched_fair time accounting ... Browse Code »

Christian suffers from a bad BIOS that wrecks his i5's TSC sync. This
results in him occasionally seeing time going backwards - which
crashes the scheduler ...

Most of our time accounting can actually handle that except the most
common one; the tick time update of sched_fair.

There is a further problem with that code; previously we assumed that
because we get a tick every TICK_NSEC our time delta could never
exceed 32bits and math was simpler.

However, ever since Frederic managed to get NO_HZ_FULL merged; this is
no longer the case since now a task can run for a long time indeed
without getting a tick. It only takes about ~4.2 seconds to overflow
our u32 in nanoseconds.

This means we not only need to better deal with time going backwards;
but also means we need to be able to deal with large deltas.

This patch reworks the entire code and uses mul_u64_u32_shr() as
proposed by Andy a long while ago.

We express our virtual time scale factor in a u32 multiplier and shift
right and the 32bit mul_u64_u32_shr() implementation reduces to a
single 32x32->64 multiply if the time delta is still short (common
case).

For 64bit a 64x64->128 multiply can be used if ARCH_SUPPORTS_INT128.

Reported-and-Tested-by: Christian Engelmayer
Signed-off-by: Peter Zijlstra
Cc: fweisbec@gmail.com
Cc: Paul Turner
Cc: Stanislaw Gruszka
Cc: Andy Lutomirski
Cc: Linus Torvalds
Cc: Andrew Morton
Link: http://lkml.kernel.org/r/20131118172706.GI3866@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar

Peter Zijlstra
2013-12-11 22:52:35 +0800
8e8339a3a sched: Initialize power_orig for overlapping groups ... Browse Code »

Yinghai reported that he saw a /0 in sg_capacity on his EX parts.
Make sure to always initialize power_orig now that we actually use it.

Ideally build_sched_domains() -> init_sched_groups_power() would also
initialize this; but for some yet unexplained reason some setups seem
to miss updates there.

Reported-by: Yinghai Lu
Tested-by: Yinghai Lu
Signed-off-by: Peter Zijlstra
Link: http://lkml.kernel.org/n/tip-l8ng2m9uml6fhibln8wqpom7@git.kernel.org
Signed-off-by: Ingo Molnar

Peter Zijlstra
2013-12-11 22:47:59 +0800
62226983d KEYS: correct alignment of system_certificate_list content in assembly file ... Browse Code »

Apart from data-type specific alignment constraints, there are also
architecture-specific alignment requirements.
For example, on s390 symbols must be on even addresses implying a 2-byte
alignment. If the system_certificate_list_end symbol is on an odd address
and if this address is loaded, the least-significant bit is ignored. As a
result, the load_system_certificate_list() fails to load the certificates
because of a wrong certificate length calculation.

To be safe, align system_certificate_list on an 8-byte boundary. Also improve
the length calculation of the system_certificate_list content. Introduce a
system_certificate_list_size (8-byte aligned because of unsigned long) variable
that stores the length. Let the linker calculate this size by introducing
a start and end label for the certificate content.

Signed-off-by: Hendrik Brueckner
Signed-off-by: David Howells

Hendrik Brueckner
2013-12-11 02:25:28 +0800
7cfe5b331 Ignore generated file kernel/x509_certificate_list ... Browse Code »

$ git status
# On branch pending-rebases
# Untracked files:
# (use "git add ..." to include in what will be committed)
#
# kernel/x509_certificate_list
nothing added to commit but untracked files present (use "git add" to track)
$

Signed-off-by: Rusty Russell
Signed-off-by: David Howells

Rusty Russell
2013-12-11 02:21:34 +0800

08 Dec, 2013

1 commit

4fc9bbf98 PCI: Disable Bus Master only on kexec reboot ... Browse Code »

Add a flag to tell the PCI subsystem that kernel is shutting down in
preparation to kexec a kernel. Add code in PCI subsystem to use this flag
to clear Bus Master bit on PCI devices only in case of kexec reboot.

This fixes a power-off problem on Acer Aspire V5-573G and likely other
machines and avoids any other issues caused by clearing Bus Master bit on
PCI devices in normal shutdown path. The problem was introduced by
b566a22c2332 ("PCI: disable Bus Master on PCI device shutdown").

This patch is based on discussion at
http://marc.info/?l=linux-pci&m=138425645204355&w=2

Link: https://bugzilla.kernel.org/show_bug.cgi?id=63861
Reported-by: Chang Liu
Signed-off-by: Khalid Aziz
Signed-off-by: Bjorn Helgaas
Acked-by: Konstantin Khlebnikov
Cc: stable@vger.kernel.org # v3.5+

Khalid Aziz
2013-12-08 05:20:28 +0800

07 Dec, 2013

1 commit

843f4f4bb Merge tag 'trace-fixes-3.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/gi… ... Browse Code »

…t/rostedt/linux-trace

Pull tracing fix from Steven Rostedt:
"A regression showed up that there's a large delay when enabling all
events. This was prevalent when FTRACE_SELFTEST was enabled which
enables all events several times, and caused the system bootup to
pause for over a minute.

This was tracked down to an addition of a synchronize_sched()
performed when system call tracepoints are unregistered.

The synchronize_sched() is needed between the unregistering of the
system call tracepoint and a deletion of a tracing instance buffer.
But placing the synchronize_sched() in the unreg of *every* system
call tracepoint is a bit overboard. A single synchronize_sched()
before the deletion of the instance is sufficient"

* tag 'trace-fixes-3.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing: Only run synchronize_sched() at instance deletion time

Linus Torvalds
2013-12-07 00:34:16 +0800

06 Dec, 2013

1 commit

3ccb01239 tracing: Only run synchronize_sched() at instance deletion time ... Browse Code »

It has been reported that boot up with FTRACE_SELFTEST enabled can take a
very long time. There can be stalls of over a minute.

This was tracked down to the synchronize_sched() called when a system call
event is disabled. As the self tests enable and disable thousands of events,
this makes the synchronize_sched() get called thousands of times.

The synchornize_sched() was added with d562aff93bfb53 "tracing: Add support
for SOFT_DISABLE to syscall events" which caused this regression (added
in 3.13-rc1).

The synchronize_sched() is to protect against the events being accessed
when a tracer instance is being deleted. When an instance is being deleted
all the events associated to it are unregistered. The synchronize_sched()
makes sure that no more users are running when it finishes.

Instead of calling synchronize_sched() for all syscall events, we only
need to call it once, after the events are unregistered and before the
instance is deleted. The event_mutex is held during this action to
prevent new users from enabling events.

Link: http://lkml.kernel.org/r/20131203124120.427b9661@gandalf.local.home

Reported-by: Petr Mladek
Acked-by: Tom Zanussi
Acked-by: Petr Mladek
Tested-by: Petr Mladek
Signed-off-by: Steven Rostedt

Steven Rostedt
2013-12-06 03:22:30 +0800

05 Dec, 2013

1 commit

1ab231b27 Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip ... Browse Code »

Pull timer fixes from Thomas Gleixner:

- timekeeping: Cure a subtle drift issue on GENERIC_TIME_VSYSCALL_OLD

- nohz: Make CONFIG_NO_HZ=n and nohz=off command line option behave the
same way. Fixes a long standing load accounting wreckage.

- clocksource/ARM: Kconfig update to avoid ARM=n wreckage

- clocksource/ARM: Fixlets for the AT91 and SH clocksource/clockevents

- Trivial documentation update and kzalloc conversion from akpms pile

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
nohz: Fix another inconsistency between CONFIG_NO_HZ=n and nohz=off
time: Fix 1ns/tick drift w/ GENERIC_TIME_VSYSCALL_OLD
clocksource: arm_arch_timer: Hide eventstream Kconfig on non-ARM
clocksource: sh_tmu: Add clk_prepare/unprepare support
clocksource: sh_tmu: Release clock when sh_tmu_register() fails
clocksource: sh_mtu2: Add clk_prepare/unprepare support
clocksource: sh_mtu2: Release clock when sh_mtu2_register() fails
ARM: at91: rm9200: switch back to clockevents_config_and_register
tick: Document tick_do_timer_cpu
timer: Convert kmalloc_node(...GFP_ZERO...) to kzalloc_node(...)
NOHZ: Check for nohz active instead of nohz enabled

Linus Torvalds
2013-12-05 00:52:09 +0800

03 Dec, 2013

3 commits

a45299e72 Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip ... Browse Code »

Pull irq fixes from Thomas Gleixner:
- Correction of fuzzy and fragile IRQ_RETVAL macro
- IRQ related resume fix affecting only XEN
- ARM/GIC fix for chained GIC controllers

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip: Gic: fix boot for chained gics
irq: Enable all irqs unconditionally in irq_resume
genirq: Correct fuzzy and fragile IRQ_RETVAL() definition

Linus Torvalds
2013-12-03 02:15:39 +0800
a0b57ca33 Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip ... Browse Code »

Pull scheduler fixes from Ingo Molnar:
"Various smaller fixlets, all over the place"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/doc: Fix generation of device-drivers
sched: Expose preempt_schedule_irq()
sched: Fix a trivial typo in comments
sched: Remove unused variable in 'struct sched_domain'
sched: Avoid NULL dereference on sd_busy
sched: Check sched_domain before computing group power
MAINTAINERS: Update file patterns in the lockdep and scheduler entries

Linus Torvalds
2013-12-03 02:13:44 +0800
e321ae4c2 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip ... Browse Code »

Pull perf fixes from Ingo Molnar:
"Misc kernel and tooling fixes"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
tools lib traceevent: Fix conversion of pointer to integer of different size
perf/trace: Properly use u64 to hold event_id
perf: Remove fragile swevent hlist optimization
ftrace, perf: Avoid infinite event generation loop
tools lib traceevent: Fix use of multiple options in processing field
perf header: Fix possible memory leaks in process_group_desc()
perf header: Fix bogus group name
perf tools: Tag thread comm as overriden

Linus Torvalds
2013-12-03 02:13:09 +0800

30 Nov, 2013

2 commits

7224b31bd Merge branch 'for-3.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq ... Browse Code »

Pull workqueue fixes from Tejun Heo:
"This contains one important fix. The NUMA support added a while back
broke ordering guarantees on ordered workqueues. It was enforced by
having single frontend interface with @max_active == 1 but the NUMA
support puts multiple interfaces on unbound workqueues on NUMA
machines thus breaking the ordered guarantee. This is fixed by
disabling NUMA support on ordered workqueues.

The above and a couple other patches were sitting in for-3.12-fixes
but I forgot to push that out, so they ended up waiting a bit too
long. My aplogies.

Other fixes are minor"

* 'for-3.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
workqueue: fix pool ID allocation leakage and remove BUILD_BUG_ON() in init_workqueues
workqueue: fix comment typo for __queue_work()
workqueue: fix ordered workqueues in NUMA setups
workqueue: swap set_cpus_allowed_ptr() and PF_NO_SETAFFINITY

Linus Torvalds
2013-11-30 01:49:08 +0800
2855987d1 Merge branch 'for-3.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup ... Browse Code »

Pull cgroup fixes from Tejun Heo:
"Fixes for three issues.

- cgroup destruction path could swamp system_wq possibly leading to
deadlock. This actually seems to happen in the wild with memcg
because memcg destruction path adds nested dependency on system_wq.

Resolved by isolating cgroup destruction work items on its
dedicated workqueue.

- Possible locking context deadlock through seqcount reported by
lockdep

- Memory leak under certain conditions"

* 'for-3.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroup: fix cgroup_subsys_state leak for seq_files
cpuset: Fix memory allocator deadlock
cgroup: use a dedicated workqueue for cgroup destruction

Linus Torvalds
2013-11-30 01:47:06 +0800

29 Nov, 2013

2 commits

0e576acbc nohz: Fix another inconsistency between CONFIG_NO_HZ=n and nohz=off ... Browse Code »

If CONFIG_NO_HZ=n tick_nohz_get_sleep_length() returns NSEC_PER_SEC/HZ.

If CONFIG_NO_HZ=y and the nohz functionality is disabled via the
command line option "nohz=off" or not enabled due to missing hardware
support, then tick_nohz_get_sleep_length() returns 0. That happens
because ts->sleep_length is never set in that case.

Set it to NSEC_PER_SEC/HZ when the NOHZ mode is inactive.

Reported-by: Michal Hocko
Reported-by: Borislav Petkov
Signed-off-by: Thomas Gleixner

Thomas Gleixner
2013-11-29 19:23:03 +0800
5ecbe3c3c kernel/extable: fix address-checks for core_kernel and init areas ... Browse Code »

The init_kernel_text() and core_kernel_text() functions should not
include the labels _einittext and _etext when checking if an address is
inside the .text or .init sections.

Signed-off-by: Helge Deller
Signed-off-by: Linus Torvalds

Helge Deller
2013-11-29 01:49:41 +0800

28 Nov, 2013

2 commits

e605b3657 cgroup: fix cgroup_subsys_state leak for seq_files ... Browse Code »

If a cgroup file implements either read_map() or read_seq_string(),
such file is served using seq_file by overriding file->f_op to
cgroup_seqfile_operations, which also overrides the release method to
single_release() from cgroup_file_release().

Because cgroup_file_open() didn't use to acquire any resources, this
used to be fine, but since f7d58818ba42 ("cgroup: pin
cgroup_subsys_state when opening a cgroupfs file"), cgroup_file_open()
pins the css (cgroup_subsys_state) which is put by
cgroup_file_release(). The patch forgot to update the release path
for seq_files and each open/release cycle leaks a css reference.

Fix it by updating cgroup_file_release() to also handle seq_files and
using it for seq_file release path too.

Signed-off-by: Tejun Heo
Cc: stable@vger.kernel.org # v3.12

Tejun Heo
2013-11-28 07:16:21 +0800
0fc0287c9 cpuset: Fix memory allocator deadlock ... Browse Code »

Juri hit the below lockdep report:

[ 4.303391] ======================================================
[ 4.303392] [ INFO: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected ]
[ 4.303394] 3.12.0-dl-peterz+ #144 Not tainted
[ 4.303395] ------------------------------------------------------
[ 4.303397] kworker/u4:3/689 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
[ 4.303399] (&p->mems_allowed_seq){+.+...}, at: [] new_slab+0x6c/0x290
[ 4.303417]
[ 4.303417] and this task is already holding:
[ 4.303418] (&(&q->__queue_lock)->rlock){..-...}, at: [] blk_execute_rq_nowait+0x5b/0x100
[ 4.303431] which would create a new lock dependency:
[ 4.303432] (&(&q->__queue_lock)->rlock){..-...} -> (&p->mems_allowed_seq){+.+...}
[ 4.303436]

[ 4.303898] the dependencies between the lock to be acquired and SOFTIRQ-irq-unsafe lock:
[ 4.303918] -> (&p->mems_allowed_seq){+.+...} ops: 2762 {
[ 4.303922] HARDIRQ-ON-W at:
[ 4.303923] [] __lock_acquire+0x65a/0x1ff0
[ 4.303926] [] lock_acquire+0x93/0x140
[ 4.303929] [] kthreadd+0x86/0x180
[ 4.303931] [] ret_from_fork+0x7c/0xb0
[ 4.303933] SOFTIRQ-ON-W at:
[ 4.303933] [] __lock_acquire+0x68c/0x1ff0
[ 4.303935] [] lock_acquire+0x93/0x140
[ 4.303940] [] kthreadd+0x86/0x180
[ 4.303955] [] ret_from_fork+0x7c/0xb0
[ 4.303959] INITIAL USE at:
[ 4.303960] [] __lock_acquire+0x344/0x1ff0
[ 4.303963] [] lock_acquire+0x93/0x140
[ 4.303966] [] kthreadd+0x86/0x180
[ 4.303969] [] ret_from_fork+0x7c/0xb0
[ 4.303972] }

Which reports that we take mems_allowed_seq with interrupts enabled. A
little digging found that this can only be from
cpuset_change_task_nodemask().

This is an actual deadlock because an interrupt doing an allocation will
hit get_mems_allowed()->...->__read_seqcount_begin(), which will spin
forever waiting for the write side to complete.

Cc: John Stultz
Cc: Mel Gorman
Reported-by: Juri Lelli
Signed-off-by: Peter Zijlstra
Tested-by: Juri Lelli
Acked-by: Li Zefan
Acked-by: Mel Gorman
Signed-off-by: Tejun Heo
Cc: stable@vger.kernel.org

Peter Zijlstra
2013-11-28 02:52:47 +0800

27 Nov, 2013

2 commits

32e475d76 sched: Expose preempt_schedule_irq() ... Browse Code »

Tony reported that aa0d53260596 ("ia64: Use preempt_schedule_irq")
broke PREEMPT=n builds on ia64.

Ok, wrapped my brain around it. I tripped over the magic asm foo which
has a single need_resched check and schedule point for both sys call
return and interrupt return.

So you need the schedule_preempt_irq() for kernel preemption from
interrupt return while on a normal syscall preemption a schedule would
be sufficient. But using schedule_preempt_irq() is not harmful here in
any way. It just sets the preempt_active bit also in cases where it
would not be required.

Even on preempt=n kernels adding the preempt_active bit is completely
harmless. So instead of having an extra function, moving the existing
one out of the ifdef PREEMPT looks like the sanest thing to do.

It would also allow getting rid of various other sti/schedule/cli asm
magic in other archs.

Reported-and-Tested-by: Tony Luck
Fixes: aa0d53260596 ("ia64: Use preempt_schedule_irq")
Signed-off-by: Thomas Gleixner
[slightly edited Changelog]
Signed-off-by: Peter Zijlstra
Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1311211230030.30673@ionos.tec.linutronix.de
Signed-off-by: Ingo Molnar

Thomas Gleixner
2013-11-27 18:04:53 +0800
8ae516aa8 Merge tag 'trace-fixes-v3.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/g… ... Browse Code »

…it/rostedt/linux-trace

Pull tracing fixes from Steven Rostedt:
"This includes two fixes.

1) is a bug fix that happens when root does the following:

echo function_graph > current_tracer
modprobe foo
echo nop > current_tracer

This causes the ftrace internal accounting to get screwed up and
crashes ftrace, preventing the user from using the function tracer
after that.

2) if a TRACE_EVENT has a string field, and NULL is given for it.

The internal trace event code does a strlen() and strcpy() on the
source of field. If it is NULL it causes the system to oops.

This bug has been there since 2.6.31, but no TRACE_EVENT ever passed
in a NULL to the string field, until now"

* tag 'trace-fixes-v3.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
ftrace: Fix function graph with loading of modules
tracing: Allow events to have NULL strings

Linus Torvalds
2013-11-27 10:04:21 +0800

26 Nov, 2013

3 commits

8a56d7761 ftrace: Fix function graph with loading of modules ... Browse Code »

Commit 8c4f3c3fa9681 "ftrace: Check module functions being traced on reload"
fixed module loading and unloading with respect to function tracing, but
it missed the function graph tracer. If you perform the following

# cd /sys/kernel/debug/tracing
# echo function_graph > current_tracer
# modprobe nfsd
# echo nop > current_tracer

You'll get the following oops message:

------------[ cut here ]------------
WARNING: CPU: 2 PID: 2910 at /linux.git/kernel/trace/ftrace.c:1640 __ftrace_hash_rec_update.part.35+0x168/0x1b9()
Modules linked in: nfsd exportfs nfs_acl lockd ipt_MASQUERADE sunrpc ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables uinput snd_hda_codec_idt
CPU: 2 PID: 2910 Comm: bash Not tainted 3.13.0-rc1-test #7
Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./To be filled by O.E.M., BIOS SDBLI944.86P 05/08/2007
0000000000000668 ffff8800787efcf8 ffffffff814fe193 ffff88007d500000
0000000000000000 ffff8800787efd38 ffffffff8103b80a 0000000000000668
ffffffff810b2b9a ffffffff81a48370 0000000000000001 ffff880037aea000
Call Trace:
[] dump_stack+0x4f/0x7c
[] warn_slowpath_common+0x81/0x9b
[] ? __ftrace_hash_rec_update.part.35+0x168/0x1b9
[] warn_slowpath_null+0x1a/0x1c
[] __ftrace_hash_rec_update.part.35+0x168/0x1b9
[] ? __mutex_lock_slowpath+0x364/0x364
[] ftrace_shutdown+0xd7/0x12b
[] unregister_ftrace_graph+0x49/0x78
[] graph_trace_reset+0xe/0x10
[] tracing_set_tracer+0xa7/0x26a
[] tracing_set_trace_write+0x8b/0xbd
[] ? ftrace_return_to_handler+0xb2/0xde
[] ? __sb_end_write+0x5e/0x5e
[] vfs_write+0xab/0xf6
[] ftrace_graph_caller+0x85/0x85
[] SyS_write+0x59/0x82
[] ftrace_graph_caller+0x85/0x85
[] system_call_fastpath+0x16/0x1b
---[ end trace 940358030751eafb ]---

The above mentioned commit didn't go far enough. Well, it covered the
function tracer by adding checks in __register_ftrace_function(). The
problem is that the function graph tracer circumvents that (for a slight
efficiency gain when function graph trace is running with a function
tracer. The gain was not worth this).

The problem came with ftrace_startup() which should always be called after
__register_ftrace_function(), if you want this bug to be completely fixed.

Anyway, this solution moves __register_ftrace_function() inside of
ftrace_startup() and removes the need to call them both.

Reported-by: Dave Wysochanski
Fixes: ed926f9b35cd ("ftrace: Use counters to enable functions to trace")
Cc: stable@vger.kernel.org # 3.0+
Signed-off-by: Steven Rostedt

Steven Rostedt (Red Hat)
2013-11-26 23:36:50 +0800
12997d1a9 Revert "workqueue: allow work_on_cpu() to be called recursively" ... Browse Code »

This reverts commit c2fda509667b0fda4372a237f5a59ea4570b1627.

c2fda509667b removed lockdep annotation from work_on_cpu() to work around
the PCI path that calls work_on_cpu() from within a work_on_cpu() work item
(PF driver .probe() method -> pci_enable_sriov() -> add VFs -> VF driver
.probe method).

961da7fb6b22 ("PCI: Avoid unnecessary CPU switch when calling driver
.probe() method) avoids that recursive work_on_cpu() use in a different
way, so this revert restores the work_on_cpu() lockdep annotation.

Signed-off-by: Bjorn Helgaas
Acked-by: Tejun Heo

Bjorn Helgaas
2013-11-26 05:37:22 +0800
ac01810c9 irq: Enable all irqs unconditionally in irq_resume ... Browse Code »

When the system enters suspend, it disables all interrupts in
suspend_device_irqs(), including the interrupts marked EARLY_RESUME.

On the resume side things are different. The EARLY_RESUME interrupts
are reenabled in sys_core_ops->resume and the non EARLY_RESUME
interrupts are reenabled in the normal system resume path.

When suspend_noirq() failed or suspend is aborted for any other
reason, we might omit the resume side call to sys_core_ops->resume()
and therefor the interrupts marked EARLY_RESUME are not reenabled and
stay disabled forever.

To solve this, enable all irqs unconditionally in irq_resume()
regardless whether interrupts marked EARLY_RESUMEhave been already
enabled or not.

This might try to reenable already enabled interrupts in the non
failure case, but the only affected platform is XEN and it has been
confirmed that it does not cause any side effects.

[ tglx: Massaged changelog. ]

Signed-off-by: Laxman Dewangan
Acked-by-and-tested-by: Konrad Rzeszutek Wilk
Acked-by: Heiko Stuebner
Reviewed-by: Pavel Machek
Cc:
Cc:
Cc:
Cc:
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1385388587-16442-1-git-send-email-ldewangan@nvidia.com
Signed-off-by: Thomas Gleixner

Laxman Dewangan
2013-11-26 05:20:02 +0800

24 Nov, 2013

1 commit

26b265cd2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 ... Browse Code »

Pull crypto update from Herbert Xu:
- Made x86 ablk_helper generic for ARM
- Phase out chainiv in favour of eseqiv (affects IPsec)
- Fixed aes-cbc IV corruption on s390
- Added constant-time crypto_memneq which replaces memcmp
- Fixed aes-ctr in omap-aes
- Added OMAP3 ROM RNG support
- Add PRNG support for MSM SoC's
- Add and use Job Ring API in caam
- Misc fixes

[ NOTE! This pull request was sent within the merge window, but Herbert
has some questionable email sending setup that makes him public enemy
#1 as far as gmail is concerned. So most of his emails seem to be
trapped by gmail as spam, resulting in me not seeing them. - Linus ]

* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (49 commits)
crypto: s390 - Fix aes-cbc IV corruption
crypto: omap-aes - Fix CTR mode counter length
crypto: omap-sham - Add missing modalias
padata: make the sequence counter an atomic_t
crypto: caam - Modify the interface layers to use JR API's
crypto: caam - Add API's to allocate/free Job Rings
crypto: caam - Add Platform driver for Job Ring
hwrng: msm - Add PRNG support for MSM SoC's
ARM: DT: msm: Add Qualcomm's PRNG driver binding document
crypto: skcipher - Use eseqiv even on UP machines
crypto: talitos - Simplify key parsing
crypto: picoxcell - Simplify and harden key parsing
crypto: ixp4xx - Simplify and harden key parsing
crypto: authencesn - Simplify key parsing
crypto: authenc - Export key parsing helper function
crypto: mv_cesa: remove deprecated IRQF_DISABLED
hwrng: OMAP3 ROM Random Number Generator support
crypto: sha256_ssse3 - also test for BMI2
crypto: mv_cesa - Remove redundant of_match_ptr
crypto: sahara - Remove redundant of_match_ptr
...

Linus Torvalds
2013-11-24 08:18:25 +0800

23 Nov, 2013

6 commits

4e8b22bd1 workqueue: fix pool ID allocation leakage and remove BUILD_BUG_ON() in init_workqueues ... Browse Code »

When one work starts execution, the high bits of work's data contain
pool ID. It can represent a maximum of WORK_OFFQ_POOL_NONE. Pool ID
is assigned WORK_OFFQ_POOL_NONE when the work being initialized
indicating that no pool is associated and get_work_pool() uses it to
check the associated pool. So if worker_pool_assign_id() assigns a
ID greater than or equal WORK_OFFQ_POOL_NONE to a pool, it triggers
leakage, and it may break the non-reentrance guarantee.

This patch fix this issue by modifying the worker_pool_assign_id()
function calling idr_alloc() by setting @end param WORK_OFFQ_POOL_NONE.

Furthermore, in the current implementation, the BUILD_BUG_ON() in
init_workqueues makes no sense. The number of worker pools needed
cannot be determined at compile time, because the number of backing
pools for UNBOUND workqueues is dynamic based on the assigned custom
attributes. So remove it.

tj: Minor comment and indentation updates.

Signed-off-by: Li Bin
Signed-off-by: Tejun Heo

Li Bin
2013-11-23 07:14:47 +0800
9ef28a73f workqueue: fix comment typo for __queue_work() ... Browse Code »

It seems the "dying" should be "draining" here.

Signed-off-by: Li Bin
Signed-off-by: Tejun Heo

Li Bin
2013-11-23 07:14:27 +0800
8a2b75384 workqueue: fix ordered workqueues in NUMA setups ... Browse Code »

An ordered workqueue implements execution ordering by using single
pool_workqueue with max_active == 1. On a given pool_workqueue, work
items are processed in FIFO order and limiting max_active to 1
enforces the queued work items to be processed one by one.

Unfortunately, 4c16bd327c ("workqueue: implement NUMA affinity for
unbound workqueues") accidentally broke this guarantee by applying
NUMA affinity to ordered workqueues too. On NUMA setups, an ordered
workqueue would end up with separate pool_workqueues for different
nodes. Each pool_workqueue still limits max_active to 1 but multiple
work items may be executed concurrently and out of order depending on
which node they are queued to.

Fix it by using dedicated ordered_wq_attrs[] when creating ordered
workqueues. The new attrs match the unbound ones except that no_numa
is always set thus forcing all NUMA nodes to share the default
pool_workqueue.

While at it, add sanity check in workqueue creation path which
verifies that an ordered workqueues has only the default
pool_workqueue.

Signed-off-by: Tejun Heo
Reported-by: Libin
Cc: stable@vger.kernel.org
Cc: Lai Jiangshan

Tejun Heo
2013-11-23 07:14:02 +0800
911512280 workqueue: swap set_cpus_allowed_ptr() and PF_NO_SETAFFINITY ... Browse Code »

Move the setting of PF_NO_SETAFFINITY up before set_cpus_allowed()
in create_worker(). Otherwise userland can change ->cpus_allowed
in between.

Signed-off-by: Oleg Nesterov
Signed-off-by: Tejun Heo

Oleg Nesterov
2013-11-23 07:13:20 +0800
e5fca243a cgroup: use a dedicated workqueue for cgroup destruction ... Browse Code »

Since be44562613851 ("cgroup: remove synchronize_rcu() from
cgroup_diput()"), cgroup destruction path makes use of workqueue. css
freeing is performed from a work item from that point on and a later
commit, ea15f8ccdb430 ("cgroup: split cgroup destruction into two
steps"), moves css offlining to workqueue too.

As cgroup destruction isn't depended upon for memory reclaim, the
destruction work items were put on the system_wq; unfortunately, some
controller may block in the destruction path for considerable duration
while holding cgroup_mutex. As large part of destruction path is
synchronized through cgroup_mutex, when combined with high rate of
cgroup removals, this has potential to fill up system_wq's max_active
of 256.

Also, it turns out that memcg's css destruction path ends up queueing
and waiting for work items on system_wq through work_on_cpu(). If
such operation happens while system_wq is fully occupied by cgroup
destruction work items, work_on_cpu() can't make forward progress
because system_wq is full and other destruction work items on
system_wq can't make forward progress because the work item waiting
for work_on_cpu() is holding cgroup_mutex, leading to deadlock.

This can be fixed by queueing destruction work items on a separate
workqueue. This patch creates a dedicated workqueue -
cgroup_destroy_wq - for this purpose. As these work items shouldn't
have inter-dependencies and mostly serialized by cgroup_mutex anyway,
giving high concurrency level doesn't buy anything and the workqueue's
@max_active is set to 1 so that destruction work items are executed
one by one on each CPU.

Hugh Dickins: Because cgroup_init() is run before init_workqueues(),
cgroup_destroy_wq can't be allocated from cgroup_init(). Do it from a
separate core_initcall(). In the future, we probably want to reorder
so that workqueue init happens before cgroup_init().

Signed-off-by: Tejun Heo
Reported-by: Hugh Dickins
Reported-by: Shawn Bohrer
Link: http://lkml.kernel.org/r/20131111220626.GA7509@sbohrermbp13-local.rgmadvisors.com
Link: http://lkml.kernel.org/g/alpine.LNX.2.00.1310301606080.2333@eggly.anvils
Cc: stable@vger.kernel.org # v3.9+

Tejun Heo
2013-11-23 06:14:39 +0800
4be77398a time: Fix 1ns/tick drift w/ GENERIC_TIME_VSYSCALL_OLD ... Browse Code »

Since commit 1e75fa8be9f (time: Condense timekeeper.xtime
into xtime_sec - merged in v3.6), there has been an problem
with the error accounting in the timekeeping code, such that
when truncating to nanoseconds, we round up to the next nsec,
but the balancing adjustment to the ntp_error value was dropped.

This causes 1ns per tick drift forward of the clock.

In 3.7, this logic was isolated to only GENERIC_TIME_VSYSCALL_OLD
architectures (s390, ia64, powerpc).

The fix is simply to balance the accounting and to subtract the
added nanosecond from ntp_error. This allows the internal long-term
clock steering to keep the clock accurate.

While this fix removes the regression added in 1e75fa8be9f, the
ideal solution is to move away from GENERIC_TIME_VSYSCALL_OLD
and use the new VSYSCALL method, which avoids entirely the
nanosecond granular rounding, and the resulting short-term clock
adjustment oscillation needed to keep long term accurate time.

[ jstultz: Many thanks to Martin for his efforts identifying this
subtle bug, and providing the fix. ]

Originally-from: Martin Schwidefsky
Cc: Tony Luck
Cc: Paul Mackerras
Cc: Benjamin Herrenschmidt
Cc: Andy Lutomirski
Cc: Paul Turner
Cc: Steven Rostedt
Cc: Richard Cochran
Cc: Prarit Bhargava
Cc: Fenghua Yu
Cc: Thomas Gleixner
Cc: stable #v3.6+
Link: http://lkml.kernel.org/r/1385149491-20307-1-git-send-email-john.stultz@linaro.org
Signed-off-by: John Stultz
Signed-off-by: Thomas Gleixner

Martin Schwidefsky
2013-11-23 04:08:11 +0800

22 Nov, 2013

2 commits

78dc53c42 Merge branch 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security ... Browse Code »

Pull security subsystem updates from James Morris:
"In this patchset, we finally get an SELinux update, with Paul Moore
taking over as maintainer of that code.

Also a significant update for the Keys subsystem, as well as
maintenance updates to Smack, IMA, TPM, and Apparmor"

and since I wanted to know more about the updates to key handling,
here's the explanation from David Howells on that:

"Okay. There are a number of separate bits. I'll go over the big bits
and the odd important other bit, most of the smaller bits are just
fixes and cleanups. If you want the small bits accounting for, I can
do that too.

(1) Keyring capacity expansion.

KEYS: Consolidate the concept of an 'index key' for key access
KEYS: Introduce a search context structure
KEYS: Search for auth-key by name rather than target key ID
Add a generic associative array implementation.
KEYS: Expand the capacity of a keyring

Several of the patches are providing an expansion of the capacity of a
keyring. Currently, the maximum size of a keyring payload is one page.
Subtract a small header and then divide up into pointers, that only gives
you ~500 pointers on an x86_64 box. However, since the NFS idmapper uses
a keyring to store ID mapping data, that has proven to be insufficient to
the cause.

Whatever data structure I use to handle the keyring payload, it can only
store pointers to keys, not the keys themselves because several keyrings
may point to a single key. This precludes inserting, say, and rb_node
struct into the key struct for this purpose.

I could make an rbtree of records such that each record has an rb_node
and a key pointer, but that would use four words of space per key stored
in the keyring. It would, however, be able to use much existing code.

I selected instead a non-rebalancing radix-tree type approach as that
could have a better space-used/key-pointer ratio. I could have used the
radix tree implementation that we already have and insert keys into it by
their serial numbers, but that means any sort of search must iterate over
the whole radix tree. Further, its nodes are a bit on the capacious side
for what I want - especially given that key serial numbers are randomly
allocated, thus leaving a lot of empty space in the tree.

So what I have is an associative array that internally is a radix-tree
with 16 pointers per node where the index key is constructed from the key
type pointer and the key description. This means that an exact lookup by
type+description is very fast as this tells us how to navigate directly to
the target key.

I made the data structure general in lib/assoc_array.c as far as it is
concerned, its index key is just a sequence of bits that leads to a
pointer. It's possible that someone else will be able to make use of it
also. FS-Cache might, for example.

(2) Mark keys as 'trusted' and keyrings as 'trusted only'.

KEYS: verify a certificate is signed by a 'trusted' key
KEYS: Make the system 'trusted' keyring viewable by userspace
KEYS: Add a 'trusted' flag and a 'trusted only' flag
KEYS: Separate the kernel signature checking keyring from module signing

These patches allow keys carrying asymmetric public keys to be marked as
being 'trusted' and allow keyrings to be marked as only permitting the
addition or linkage of trusted keys.

Keys loaded from hardware during kernel boot or compiled into the kernel
during build are marked as being trusted automatically. New keys can be
loaded at runtime with add_key(). They are checked against the system
keyring contents and if their signatures can be validated with keys that
are already marked trusted, then they are marked trusted also and can
thus be added into the master keyring.

Patches from Mimi Zohar make this usable with the IMA keyrings also.

(3) Remove the date checks on the key used to validate a module signature.

X.509: Remove certificate date checks

It's not reasonable to reject a signature just because the key that it was
generated with is no longer valid datewise - especially if the kernel
hasn't yet managed to set the system clock when the first module is
loaded - so just remove those checks.

(4) Make it simpler to deal with additional X.509 being loaded into the kernel.

KEYS: Load *.x509 files into kernel keyring
KEYS: Have make canonicalise the paths of the X.509 certs better to deduplicate

The builder of the kernel now just places files with the extension ".x509"
into the kernel source or build trees and they're concatenated by the
kernel build and stuffed into the appropriate section.

(5) Add support for userspace kerberos to use keyrings.

KEYS: Add per-user_namespace registers for persistent per-UID kerberos caches
KEYS: Implement a big key type that can save to tmpfs

Fedora went to, by default, storing kerberos tickets and tokens in tmpfs.
We looked at storing it in keyrings instead as that confers certain
advantages such as tickets being automatically deleted after a certain
amount of time and the ability for the kernel to get at these tokens more
easily.

To make this work, two things were needed:

(a) A way for the tickets to persist beyond the lifetime of all a user's
sessions so that cron-driven processes can still use them.

The problem is that a user's session keyrings are deleted when the
session that spawned them logs out and the user's user keyring is
deleted when the UID is deleted (typically when the last log out
happens), so neither of these places is suitable.

I've added a system keyring into which a 'persistent' keyring is
created for each UID on request. Each time a user requests their
persistent keyring, the expiry time on it is set anew. If the user
doesn't ask for it for, say, three days, the keyring is automatically
expired and garbage collected using the existing gc. All the kerberos
tokens it held are then also gc'd.

(b) A key type that can hold really big tickets (up to 1MB in size).

The problem is that Active Directory can return huge tickets with lots
of auxiliary data attached. We don't, however, want to eat up huge
tracts of unswappable kernel space for this, so if the ticket is
greater than a certain size, we create a swappable shmem file and dump
the contents in there and just live with the fact we then have an
inode and a dentry overhead. If the ticket is smaller than that, we
slap it in a kmalloc()'d buffer"

* 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (121 commits)
KEYS: Fix keyring content gc scanner
KEYS: Fix error handling in big_key instantiation
KEYS: Fix UID check in keyctl_get_persistent()
KEYS: The RSA public key algorithm needs to select MPILIB
ima: define '_ima' as a builtin 'trusted' keyring
ima: extend the measurement list to include the file signature
kernel/system_certificate.S: use real contents instead of macro GLOBAL()
KEYS: fix error return code in big_key_instantiate()
KEYS: Fix keyring quota misaccounting on key replacement and unlink
KEYS: Fix a race between negating a key and reading the error set
KEYS: Make BIG_KEYS boolean
apparmor: remove the "task" arg from may_change_ptraced_domain()
apparmor: remove parent task info from audit logging
apparmor: remove tsk field from the apparmor_audit_struct
apparmor: fix capability to not use the current task, during reporting
Smack: Ptrace access check mode
ima: provide hash algo info in the xattr
ima: enable support for larger default filedata hash algorithms
ima: define kernel parameter 'ima_template=' to change configured default
ima: add Kconfig default measurement list template
...

Linus Torvalds
2013-11-22 11:46:00 +0800
3eaded86a Merge git://git.infradead.org/users/eparis/audit ... Browse Code »

Pull audit updates from Eric Paris:
"Nothing amazing. Formatting, small bug fixes, couple of fixes where
we didn't get records due to some old VFS changes, and a change to how
we collect execve info..."

Fixed conflict in fs/exec.c as per Eric and linux-next.

* git://git.infradead.org/users/eparis/audit: (28 commits)
audit: fix type of sessionid in audit_set_loginuid()
audit: call audit_bprm() only once to add AUDIT_EXECVE information
audit: move audit_aux_data_execve contents into audit_context union
audit: remove unused envc member of audit_aux_data_execve
audit: Kill the unused struct audit_aux_data_capset
audit: do not reject all AUDIT_INODE filter types
audit: suppress stock memalloc failure warnings since already managed
audit: log the audit_names record type
audit: add child record before the create to handle case where create fails
audit: use given values in tty_audit enable api
audit: use nlmsg_len() to get message payload length
audit: use memset instead of trying to initialize field by field
audit: fix info leak in AUDIT_GET requests
audit: update AUDIT_INODE filter rule to comparator function
audit: audit feature to set loginuid immutable
audit: audit feature to only allow unsetting the loginuid
audit: allow unsetting the loginuid (with priv)
audit: remove CONFIG_AUDIT_LOGINUID_IMMUTABLE
audit: loginuid functions coding style
selinux: apply selinux checks on new audit message types
...

Linus Torvalds
2013-11-22 11:18:14 +0800

21 Nov, 2013

2 commits

b5898cd05 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs ... Browse Code »

Pull vfs bits and pieces from Al Viro:
"Assorted bits that got missed in the first pull request + fixes for a
couple of coredump regressions"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fold try_to_ascend() into the sole remaining caller
dcache.c: get rid of pointless macros
take read_seqbegin_or_lock() and friends to seqlock.h
consolidate simple ->d_delete() instances
gfs2: endianness misannotations
dump_emit(): use __kernel_write(), not vfs_write()
dump_align(): fix the dumb braino

Linus Torvalds
2013-11-21 06:25:39 +0800
82023bb7f Merge tag 'pm+acpi-2-3.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm ... Browse Code »

Pull more ACPI and power management updates from Rafael Wysocki:

- ACPI-based device hotplug fixes for issues introduced recently and a
fix for an older error code path bug in the ACPI PCI host bridge
driver

- Fix for recently broken OMAP cpufreq build from Viresh Kumar

- Fix for a recent hibernation regression related to s2disk

- Fix for a locking-related regression in the ACPI EC driver from
Puneet Kumar

- System suspend error code path fix related to runtime PM and runtime
PM documentation update from Ulf Hansson

- cpufreq's conservative governor fix from Xiaoguang Chen

- New processor IDs for intel_idle and turbostat and removal of an
obsolete Kconfig option from Len Brown

- New device IDs for the ACPI LPSS (Low-Power Subsystem) driver and
ACPI-based PCI hotplug (ACPIPHP) cleanup from Mika Westerberg

- Removal of several ACPI video DMI blacklist entries that are not
necessary any more from Aaron Lu

- Rework of the ACPI companion representation in struct device and code
cleanup related to that change from Rafael J Wysocki, Lan Tianyu and
Jarkko Nikula

- Fixes for assigning names to ACPI-enumerated I2C and SPI devices from
Jarkko Nikula

* tag 'pm+acpi-2-3.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (24 commits)
PCI / hotplug / ACPI: Drop unused acpiphp_debug declaration
ACPI / scan: Set flags.match_driver in acpi_bus_scan_fixed()
ACPI / PCI root: Clear driver_data before failing enumeration
ACPI / hotplug: Fix PCI host bridge hot removal
ACPI / hotplug: Fix acpi_bus_get_device() return value check
cpufreq: governor: Remove fossil comment in the cpufreq_governor_dbs()
ACPI / video: clean up DMI table for initial black screen problem
ACPI / EC: Ensure lock is acquired before accessing ec struct members
PM / Hibernate: Do not crash kernel in free_basic_memory_bitmaps()
ACPI / AC: Remove struct acpi_device pointer from struct acpi_ac
spi: Use stable dev_name for ACPI enumerated SPI slaves
i2c: Use stable dev_name for ACPI enumerated I2C slaves
ACPI: Provide acpi_dev_name accessor for struct acpi_device device name
ACPI / bind: Use (put|get)_device() on ACPI device objects too
ACPI: Eliminate the DEVICE_ACPI_HANDLE() macro
ACPI / driver core: Store an ACPI device pointer in struct acpi_dev_node
cpufreq: OMAP: Fix compilation error 'r & ret undeclared'
PM / Runtime: Fix error path for prepare
PM / Runtime: Update documentation around probe|remove|suspend
cpufreq: conservative: set requested_freq to policy max when it is over policy max
...

Linus Torvalds
2013-11-21 05:25:04 +0800

20 Nov, 2013

1 commit

1ee2dcc22 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net ... Browse Code »

Pull networking fixes from David Miller:
"Mostly these are fixes for fallout due to merge window changes, as
well as cures for problems that have been with us for a much longer
period of time"

1) Johannes Berg noticed two major deficiencies in our genetlink
registration. Some genetlink protocols we passing in constant
counts for their ops array rather than something like
ARRAY_SIZE(ops) or similar. Also, some genetlink protocols were
using fixed IDs for their multicast groups.

We have to retain these fixed IDs to keep existing userland tools
working, but reserve them so that other multicast groups used by
other protocols can not possibly conflict.

In dealing with these two problems, we actually now use less state
management for genetlink operations and multicast groups.

2) When configuring interface hardware timestamping, fix several
drivers that simply do not validate that the hwtstamp_config value
is one the driver actually supports. From Ben Hutchings.

3) Invalid memory references in mwifiex driver, from Amitkumar Karwar.

4) In dev_forward_skb(), set the skb->protocol in the right order
relative to skb_scrub_packet(). From Alexei Starovoitov.

5) Bridge erroneously fails to use the proper wrapper functions to make
calls to netdev_ops->ndo_vlan_rx_{add,kill}_vid. Fix from Toshiaki
Makita.

6) When detaching a bridge port, make sure to flush all VLAN IDs to
prevent them from leaking, also from Toshiaki Makita.

7) Put in a compromise for TCP Small Queues so that deep queued devices
that delay TX reclaim non-trivially don't have such a performance
decrease. One particularly problematic area is 802.11 AMPDU in
wireless. From Eric Dumazet.

8) Fix crashes in tcp_fastopen_cache_get(), we can see NULL socket dsts
here. Fix from Eric Dumzaet, reported by Dave Jones.

9) Fix use after free in ipv6 SIT driver, from Willem de Bruijn.

10) When computing mergeable buffer sizes, virtio-net fails to take the
virtio-net header into account. From Michael Dalton.

11) Fix seqlock deadlock in ip4_datagram_connect() wrt. statistic
bumping, this one has been with us for a while. From Eric Dumazet.

12) Fix NULL deref in the new TIPC fragmentation handling, from Erik
Hugne.

13) 6lowpan bit used for traffic classification was wrong, from Jukka
Rissanen.

14) macvlan has the same issue as normal vlans did wrt. propagating LRO
disabling down to the real device, fix it the same way. From Michal
Kubecek.

15) CPSW driver needs to soft reset all slaves during suspend, from
Daniel Mack.

16) Fix small frame pacing in FQ packet scheduler, from Eric Dumazet.

17) The xen-netfront RX buffer refill timer isn't properly scheduled on
partial RX allocation success, from Ma JieYue.

18) When ipv6 ping protocol support was added, the AF_INET6 protocol
initialization cleanup path on failure was borked a little. Fix
from Vlad Yasevich.

19) If a socket disconnects during a read/recvmsg/recvfrom/etc that
blocks we can do the wrong thing with the msg_name we write back to
userspace. From Hannes Frederic Sowa. There is another fix in the
works from Hannes which will prevent future problems of this nature.

20) Fix route leak in VTI tunnel transmit, from Fan Du.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (106 commits)
genetlink: make multicast groups const, prevent abuse
genetlink: pass family to functions using groups
genetlink: add and use genl_set_err()
genetlink: remove family pointer from genl_multicast_group
genetlink: remove genl_unregister_mc_group()
hsr: don't call genl_unregister_mc_group()
quota/genetlink: use proper genetlink multicast APIs
drop_monitor/genetlink: use proper genetlink multicast APIs
genetlink: only pass array to genl_register_family_with_ops()
tcp: don't update snd_nxt, when a socket is switched from repair mode
atm: idt77252: fix dev refcnt leak
xfrm: Release dst if this dst is improper for vti tunnel
netlink: fix documentation typo in netlink_set_err()
be2net: Delete secondary unicast MAC addresses during be_close
be2net: Fix unconditional enabling of Rx interface options
net, virtio_net: replace the magic value
ping: prevent NULL pointer dereference on write to msg_name
bnx2x: Prevent "timeout waiting for state X"
bnx2x: prevent CFC attention
bnx2x: Prevent panic during DMAE timeout
...

Linus Torvalds
2013-11-20 07:50:47 +0800