16 Jan, 2015

40 commits

  • Greg Kroah-Hartman
     
  • commit 690eac53daff34169a4d74fc7bfbd388c4896abb upstream.

    Commit fee7e49d4514 ("mm: propagate error from stack expansion even for
    guard page") made sure that we return the error properly for stack
    growth conditions. It also theorized that counting the guard page
    towards the stack limit might break something, but also said "Let's see
    if anybody notices".

    Somebody did notice. Apparently android-x86 sets the stack limit very
    close to the limit indeed, and including the guard page in the rlimit
    check causes the android 'zygote' process problems.

    So this adds the (fairly trivial) code to make the stack rlimit check be
    against the actual real stack size, rather than the size of the vma that
    includes the guard page.

    Reported-and-tested-by: Chih-Wei Huang
    Cc: Jay Foad
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Linus Torvalds
     
  • commit fee7e49d45149fba60156f5b59014f764d3e3728 upstream.

    Jay Foad reports that the address sanitizer test (asan) sometimes gets
    confused by a stack pointer that ends up being outside the stack vma
    that is reported by /proc/maps.

    This happens due to an interaction between RLIMIT_STACK and the guard
    page: when we do the guard page check, we ignore the potential error
    from the stack expansion, which effectively results in a missing guard
    page, since the expected stack expansion won't have been done.

    And since /proc/maps explicitly ignores the guard page (commit
    d7824370e263: "mm: fix up some user-visible effects of the stack guard
    page"), the stack pointer ends up being outside the reported stack area.

    This is the minimal patch: it just propagates the error. It also
    effectively makes the guard page part of the stack limit, which in turn
    means that the actual real stack is one page less than the stack limit.

    Let's see if anybody notices. We could teach acct_stack_growth() to
    allow an extra page for a grow-up/grow-down stack in the rlimit test,
    but I don't want to add more complexity if it isn't needed.

    Reported-and-tested-by: Jay Foad
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Linus Torvalds
     
  • commit 9e5e3661727eaf960d3480213f8e87c8d67b6956 upstream.

    Charles Shirron and Paul Cassella from Cray Inc have reported kswapd
    stuck in a busy loop with nothing left to balance, but
    kswapd_try_to_sleep() failing to sleep. Their analysis found the cause
    to be a combination of several factors:

    1. A process is waiting in throttle_direct_reclaim() on pgdat->pfmemalloc_wait

    2. The process has been killed (by OOM in this case), but has not yet been
    scheduled to remove itself from the waitqueue and die.

    3. kswapd checks for throttled processes in prepare_kswapd_sleep():

        if (waitqueue_active(&pgdat->pfmemalloc_wait)) {
                wake_up(&pgdat->pfmemalloc_wait);
                return false; /* kswapd will not go to sleep */
        }

    However, for a process that was already killed, wake_up() does not remove
    the process from the waitqueue, since try_to_wake_up() checks its state
    first and returns false when the process is no longer waiting.

    4. kswapd is running on the same CPU as the only CPU that the process is
    allowed to run on (through cpus_allowed, or possibly single-cpu system).

    5. CONFIG_PREEMPT_NONE=y kernel is used. If there's nothing to balance, kswapd
    encounters no voluntary preemption points and repeatedly fails
    prepare_kswapd_sleep(), blocking the process from running and removing
    itself from the waitqueue, which would let kswapd sleep.

    So, the source of the problem is that we prevent kswapd from going to
    sleep while there are processes waiting on the pfmemalloc_wait queue,
    and a process waiting on a queue is guaranteed to be removed from the
    queue only when it gets scheduled. This was done to make sure that no
    process is left sleeping on pfmemalloc_wait when kswapd itself goes to
    sleep.

    However, it isn't necessary to postpone kswapd sleep until the
    pfmemalloc_wait queue actually empties. To prevent processes from being
    left sleeping, it's actually enough to guarantee that all processes
    waiting on pfmemalloc_wait queue have been woken up by the time we put
    kswapd to sleep.

    This patch therefore fixes this issue by substituting 'wake_up' with
    'wake_up_all' and removing 'return false' in the code snippet from
    prepare_kswapd_sleep() above. Note that if any process puts itself in
    the queue after this waitqueue_active() check, or after the wake up
    itself, it means that the process will also wake up kswapd - and since
    we are under prepare_to_wait(), the wake up won't be missed. We also
    update the comment in prepare_kswapd_sleep() to more clearly describe
    the races it is preventing.

    Fixes: 5515061d22f0 ("mm: throttle direct reclaimers if PF_MEMALLOC reserves are low and swap is backed by network storage")
    Signed-off-by: Vlastimil Babka
    Signed-off-by: Vladimir Davydov
    Cc: Mel Gorman
    Cc: Johannes Weiner
    Acked-by: Michal Hocko
    Acked-by: Rik van Riel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Vlastimil Babka
     
  • commit 2836766a9d0bd02c66073f8dd44796e6cc23848d upstream.

    Sleep in atomic context happened on Trats2 board after inserting or
    removing SD card because mmc_gpio_get_cd() was called under spin lock.

    Fix this by moving card detection earlier, before acquiring spin lock.
    The mmc_gpio_get_cd() call does not have to be protected by spin lock
    because it does not access any sdhci internal data.
    The sdhci_do_get_cd() call accesses host flags (SDHCI_DEVICE_DEAD).
    After moving it outside the spin lock it could theoretically race with
    driver removal, but there is still no actual protection against manual
    card eject.

    Dmesg after inserting SD card:
    [ 41.663414] BUG: sleeping function called from invalid context at drivers/gpio/gpiolib.c:1511
    [ 41.670469] in_atomic(): 1, irqs_disabled(): 128, pid: 30, name: kworker/u8:1
    [ 41.677580] INFO: lockdep is turned off.
    [ 41.681486] irq event stamp: 61972
    [ 41.684872] hardirqs last enabled at (61971): [] _raw_spin_unlock_irq+0x24/0x5c
    [ 41.693118] hardirqs last disabled at (61972): [] _raw_spin_lock_irq+0x18/0x54
    [ 41.701190] softirqs last enabled at (61648): [] __do_softirq+0x234/0x2c8
    [ 41.708914] softirqs last disabled at (61631): [] irq_exit+0xd0/0x114
    [ 41.716206] Preemption disabled at:[< (null)>] (null)
    [ 41.721500]
    [ 41.722985] CPU: 3 PID: 30 Comm: kworker/u8:1 Tainted: G W 3.18.0-rc5-next-20141121 #883
    [ 41.732111] Workqueue: kmmcd mmc_rescan
    [ 41.735945] [] (unwind_backtrace) from [] (show_stack+0x10/0x14)
    [ 41.743661] [] (show_stack) from [] (dump_stack+0x70/0xbc)
    [ 41.750867] [] (dump_stack) from [] (gpiod_get_raw_value_cansleep+0x18/0x30)
    [ 41.759628] [] (gpiod_get_raw_value_cansleep) from [] (mmc_gpio_get_cd+0x38/0x58)
    [ 41.768821] [] (mmc_gpio_get_cd) from [] (sdhci_request+0x50/0x1a4)
    [ 41.776808] [] (sdhci_request) from [] (mmc_start_request+0x138/0x268)
    [ 41.785051] [] (mmc_start_request) from [] (mmc_wait_for_req+0x58/0x1a0)
    [ 41.793469] [] (mmc_wait_for_req) from [] (mmc_wait_for_cmd+0x58/0x78)
    [ 41.801714] [] (mmc_wait_for_cmd) from [] (mmc_io_rw_direct_host+0x98/0x124)
    [ 41.810480] [] (mmc_io_rw_direct_host) from [] (sdio_reset+0x2c/0x64)
    [ 41.818641] [] (sdio_reset) from [] (mmc_rescan+0x254/0x2e4)
    [ 41.826028] [] (mmc_rescan) from [] (process_one_work+0x180/0x3f4)
    [ 41.833920] [] (process_one_work) from [] (worker_thread+0x34/0x4b0)
    [ 41.841991] [] (worker_thread) from [] (kthread+0xe4/0x104)
    [ 41.849285] [] (kthread) from [] (ret_from_fork+0x14/0x2c)
    [ 42.038276] mmc0: new high speed SDHC card at address 1234

    Signed-off-by: Krzysztof Kozlowski
    Fixes: 94144a465dd0 ("mmc: sdhci: add get_cd() implementation")
    Signed-off-by: Ulf Hansson
    Signed-off-by: Greg Kroah-Hartman

    Krzysztof Kozlowski
     
  • commit 4302a59629f7a0bd70fd1605d2b558597517372a upstream.

    When used via spidev with more than one message to transfer via
    SPI_IOC_MESSAGE, the current implementation would return with
    -EINVAL, since bits_per_word and speed_hz are set in all
    transfer structs. And in the 2nd loop, status will stay at
    -EINVAL as it's not overwritten again via fsl_spi_setup_transfer().

    This patch changes this behaviour by first checking if one of
    the messages uses different settings. If this is the case,
    the function will return with -EINVAL. If not, the messages
    are transferred correctly.

    Signed-off-by: Stefan Roese
    Signed-off-by: Mark Brown
    Cc: Esben Haabendal
    Signed-off-by: Greg Kroah-Hartman

    Stefan Roese
     
  • commit f61ff6c06dc8f32c7036013ad802c899ec590607 upstream.

    Linus reported perf report command being interrupted due to processing
    of 'out of order' event, with following error:

    Timestamp below last timeslice flush
    0x5733a8 [0x28]: failed to process type: 3

    I could reproduce the issue and in my case it was caused by one CPU
    (mmap) being behind during record and userspace mmap reader seeing the
    data after other CPUs data were already stored.

    This is expected under some circumstances because we need to limit the
    number of events that we queue for reordering when we receive a
    PERF_RECORD_FINISHED_ROUND or when we force flush due to memory
    pressure.

    Reported-by: Linus Torvalds
    Signed-off-by: Jiri Olsa
    Acked-by: Ingo Molnar
    Cc: Andi Kleen
    Cc: Corey Ashford
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Linus Torvalds
    Cc: Matt Fleming
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/r/1417016371-30249-1-git-send-email-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo
    [zhangzhiqiang: backport to 3.10:
    - adjust context
    - in commit f61ff6c06d, struct events_stats was defined in
    tools/perf/util/event.h, while in 3.10 stable it is defined in
    tools/perf/util/hist.h.
    - in 3.10 stable there is no pr_oe_time(), which is used for debugging.
    - After the above adjustments, this becomes the same as the original patch:
    https://github.com/torvalds/linux/commit/f61ff6c06dc8f32c7036013ad802c899ec590607
    ]
    Signed-off-by: Zhiqiang Zhang
    Signed-off-by: Greg Kroah-Hartman

    Jiri Olsa
     
  • commit 9fc81d87420d0d3fd62d5e5529972c0ad9eab9cc upstream.

    We allow the PMU driver to change the cpu on which the event
    should be installed. This happened in patch:

    e2d37cd213dc ("perf: Allow the PMU driver to choose the CPU on which to install events")

    This patch also forces all the group members to follow
    the currently opened event's cpu if the group happened
    to be moved.

    This and the change of event->cpu in perf_install_in_context()
    function introduced in:

    0cda4c023132 ("perf: Introduce perf_pmu_migrate_context()")

    forces group members to change their event->cpu,
    if the currently-opened-event's PMU changed the cpu
    and there is a group move.

    The above behaviour causes a problem for breakpoint events,
    which use event->cpu to touch cpu-specific data for
    breakpoint accounting. By changing event->cpu, some
    breakpoint slots were wrongly accounted for a given
    cpu.

    Vince's perf fuzzer hit this issue and caused the following
    WARN on my setup:

    WARNING: CPU: 0 PID: 20214 at arch/x86/kernel/hw_breakpoint.c:119 arch_install_hw_breakpoint+0x142/0x150()
    Can't find any breakpoint slot
    [...]

    This patch changes the group moving code to keep the event's
    original cpu.

    Reported-by: Vince Weaver
    Signed-off-by: Jiri Olsa
    Cc: Arnaldo Carvalho de Melo
    Cc: Frederic Weisbecker
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Vince Weaver
    Cc: Yan, Zheng
    Link: http://lkml.kernel.org/r/1418243031-20367-3-git-send-email-jolsa@kernel.org
    Signed-off-by: Ingo Molnar
    Signed-off-by: Greg Kroah-Hartman

    Jiri Olsa
     
  • commit af91568e762d04931dcbdd6bef4655433d8b9418 upstream.

    The uncore_collect_events() function assumes that an event group
    might contain only uncore events, which is wrong, because it
    might contain any type of event.

    This bug leads to the uncore framework touching non-uncore events,
    which could end up in all sorts of bugs.

    One was triggered by Vince's perf fuzzer, when the uncore code
    touched breakpoint event private event space as if it was uncore
    event and caused BUG:

    BUG: unable to handle kernel paging request at ffffffff82822068
    IP: [] uncore_assign_events+0x188/0x250
    ...

    The code in uncore_assign_events() function was looking for
    event->hw.idx data while the event was initialized as a
    breakpoint with different members in event->hw union.

    This patch forces uncore_collect_events() to collect only uncore
    events.

    Reported-by: Vince Weaver
    Signed-off-by: Jiri Olsa
    Cc: Arnaldo Carvalho de Melo
    Cc: Frederic Weisbecker
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Yan, Zheng
    Link: http://lkml.kernel.org/r/1418243031-20367-2-git-send-email-jolsa@kernel.org
    Signed-off-by: Ingo Molnar
    Signed-off-by: Greg Kroah-Hartman

    Jiri Olsa
     
  • commit 6f8960541b1eb6054a642da48daae2320fddba93 upstream.

    Commit 1d52c78afbb (Btrfs: try not to ENOSPC on log replay) added a
    check to skip delayed inode updates during log replay because it
    confuses the enospc code. But the delayed processing will end up
    ignoring delayed refs from log replay because the inode itself wasn't
    put through the delayed code.

    This can end up triggering a warning at commit time:

    WARNING: CPU: 2 PID: 778 at fs/btrfs/delayed-inode.c:1410 btrfs_assert_delayed_root_empty+0x32/0x34()

    Which is repeated for each commit because we never process the delayed
    inode ref update.

    The fix used here is to change btrfs_delayed_delete_inode_ref to return
    an error if we're currently in log replay. The caller will do the ref
    deletion immediately and everything will work properly.

    Signed-off-by: Chris Mason
    Signed-off-by: Greg Kroah-Hartman

    Chris Mason
     
  • commit f43c27188a49111b58e9611afa2f0365b0b55625 upstream.

    On arm64 the TTBR0_EL1 register is set either to the reserved TTBR0
    page tables on boot or to the active_mm mappings belonging to user space
    processes; it must never be set to swapper_pg_dir page table mappings.

    When a CPU is booted its active_mm is set to init_mm even though its
    TTBR0_EL1 points at the reserved TTBR0 page mappings. This implies
    that when __cpu_suspend is triggered the active_mm can point at
    init_mm even if the current TTBR0_EL1 register contains the reserved
    TTBR0_EL1 mappings.

    Therefore, the mm save and restore executed in __cpu_suspend might
    turn out to be erroneous in that, if the current->active_mm corresponds
    to init_mm, on resume from low power it ends up restoring in the
    TTBR0_EL1 the init_mm mappings that are global and can cause speculation
    of TLB entries which end up being propagated to user space.

    This patch fixes the issue by checking the active_mm pointer before
    restoring the TTBR0 mappings. If the current active_mm == &init_mm,
    the code sets the TTBR0_EL1 to the reserved TTBR0 mapping instead of
    switching back to the active_mm, which is the expected behaviour
    corresponding to the TTBR0_EL1 settings when __cpu_suspend was entered.

    Fixes: 95322526ef62 ("arm64: kernel: cpu_{suspend/resume} implementation")
    Cc: Will Deacon
    Signed-off-by: Lorenzo Pieralisi
    Signed-off-by: Catalin Marinas
    Signed-off-by: Greg Kroah-Hartman

    Lorenzo Pieralisi
     
  • commit c3684fbb446501b48dec6677a6a9f61c215053de upstream.

    The function cpu_resume currently lives in the .data section.
    There's no reason for it to be there since we can use relative
    instructions without a problem. Move a few cpu_resume data
    structures out of the assembly file so the .data annotation
    can be dropped completely and cpu_resume ends up in the read
    only text section.

    Reviewed-by: Kees Cook
    Reviewed-by: Mark Rutland
    Reviewed-by: Lorenzo Pieralisi
    Tested-by: Mark Rutland
    Tested-by: Lorenzo Pieralisi
    Tested-by: Kees Cook
    Acked-by: Ard Biesheuvel
    Signed-off-by: Laura Abbott
    Signed-off-by: Will Deacon
    Signed-off-by: Greg Kroah-Hartman

    Laura Abbott
     
  • commit 714f59925595b9c2ea9c22b107b340d38e3b3bc9 upstream.

    CPU suspend is the standard kernel interface to be used to enter
    low-power states on ARM64 systems. Current cpu_suspend implementation
    by default assumes that all low power states lose the CPU context,
    so the CPU registers must be saved and cleaned to DRAM upon state
    entry. Furthermore, the current cpu_suspend() implementation assumes
    that if the CPU suspend back-end method returns when called, this has
    to be considered an error regardless of the return code (which can be
    successful) since the CPU was not expected to return from a code path that
    is different from cpu_resume code path - eg returning from the reset vector.

    All in all this means that the current API does not cope well with low-power
    states that preserve the CPU context when entered (ie retention states),
    since first of all the context is saved for nothing on state entry for
    those states and a successful state entry can return as a normal function
    return, which is considered an error by the current CPU suspend
    implementation.

    This patch refactors the cpu_suspend() API so that it can be split in
    two separate functionalities. The arm64 cpu_suspend API just provides
    a wrapper around CPU suspend operation hook. A new function is
    introduced (for architecture code use only) for states that require
    context saving upon entry:

    __cpu_suspend(unsigned long arg, int (*fn)(unsigned long))

    __cpu_suspend() saves the context on function entry and calls the
    so called suspend finisher (ie fn) to complete the suspend operation.
    The finisher is not expected to return, unless it fails in which case
    the error is propagated back to the __cpu_suspend caller.

    The API refactoring results in the following pseudo code call sequence for a
    suspending CPU, when triggered from a kernel subsystem:

        /*
         * int cpu_suspend(unsigned long idx)
         * @idx: idle state index
         */
        {
        -> cpu_suspend(idx)
            |---> CPU operations suspend hook called, if present
            |--> if (retention_state)
                     |--> direct suspend back-end call (eg PSCI suspend)
                 else
                     |--> __cpu_suspend(idx, &back_end_finisher);
        }

    By refactoring the cpu_suspend API this way, the CPU operations back-end
    has a chance to detect whether idle states require state saving or not
    and can call the required suspend operations accordingly either through
    simple function call or indirectly through __cpu_suspend() which carries out
    state saving and suspend finisher dispatching to complete idle state entry.

    Reviewed-by: Catalin Marinas
    Reviewed-by: Hanjun Guo
    Signed-off-by: Lorenzo Pieralisi
    Signed-off-by: Catalin Marinas
    Signed-off-by: Greg Kroah-Hartman

    Lorenzo Pieralisi
     
  • commit 18ab7db6b749ac27aac08d572afbbd2f4d937934 upstream.

    The suspend init function must be marked as __init, since it is not
    needed after the kernel has booted. This patch moves the
    cpu_suspend_init() function to the __init section.

    Signed-off-by: Lorenzo Pieralisi
    Signed-off-by: Catalin Marinas
    Signed-off-by: Greg Kroah-Hartman

    Lorenzo Pieralisi
     
  • commit 1b1f3e1699a9886f1070f94171097ab4ccdbfc95 upstream.

    If an ACPI device object whose _STA returns 0 (not present and not
    functional) has _PR0 or _PS0, its power_manageable flag will be set
    and acpi_bus_init_power() will return 0 for it. Consequently, if
    such a device object is passed to the ACPI device PM functions, they
    will attempt to carry out the requested operation on the device,
    although they should not do that for devices that are not present.

    To fix that problem make acpi_bus_init_power() return an error code
    for devices that are not present which will cause power_manageable to
    be cleared for them as appropriate in acpi_bus_get_power_flags().
    However, the lists of power resources should not be freed for the
    device in that case, so modify acpi_bus_get_power_flags() to keep
    those lists even if acpi_bus_init_power() returns an error.
    Accordingly, when deciding whether or not the lists of power
    resources need to be freed, acpi_free_power_resources_lists()
    should check the power.flags.power_resources flag instead of
    flags.power_manageable, so make that change too.

    Furthermore, if acpi_bus_attach() sees that flags.initialized is
    unset for the given device, it should reset the power management
    settings of the device and re-initialize them from scratch instead
    of relying on the previous settings (the device may have appeared
    after being not present previously, for example), so make it use
    the 'valid' flag of the D0 power state as the initial value of
    flags.power_manageable for it and call acpi_bus_init_power() to
    discover its current power state.

    Signed-off-by: Rafael J. Wysocki
    Reviewed-by: Mika Westerberg
    Signed-off-by: Greg Kroah-Hartman

    Rafael J. Wysocki
     
  • commit e55355453600a33bb5ca4f71f2d7214875f3b061 upstream.

    Enabling the hardware I/O coherency on Armada 370, Armada 375, Armada
    38x and Armada XP requires a certain number of conditions:

    - On Armada 370, the cache policy must be set to write-allocate.

    - On Armada 375, 38x and XP, the cache policy must be set to
    write-allocate, the pages must be mapped with the shareable
    attribute, and the SMP bit must be set

    Currently, on Armada XP, when CONFIG_SMP is enabled, those conditions
    are met. However, when Armada XP is used in a !CONFIG_SMP kernel, none
    of these conditions are met. With Armada 370, the situation is worse:
    since the processor is single core, regardless of whether CONFIG_SMP
    or !CONFIG_SMP is used, the cache policy will be set to write-back by
    the kernel and not write-allocate.

    Since solving this problem turns out to be quite complicated, and we
    don't want to leave users of a mainline kernel exposed to infrequent
    but real data corruption, this commit proposes to simply disable
    hardware I/O coherency in situations where it is known not to work.

    And basically, the is_smp() function of the kernel tells us whether it
    is OK to enable hardware I/O coherency or not, so this commit slightly
    refactors the coherency_type() function to return
    COHERENCY_FABRIC_TYPE_NONE when is_smp() is false, or the appropriate
    type of the coherency fabric in the other case.

    Thanks to this, the I/O coherency fabric will no longer be used at all
    in !CONFIG_SMP configurations. It will continue to be used in
    CONFIG_SMP configurations on Armada XP, Armada 375 and Armada 38x
    (which are multiple cores processors), but will no longer be used on
    Armada 370 (which is a single core processor).

    In the process, it simplifies the implementation of the
    coherency_type() function, and adds a missing call to of_node_put().

    Signed-off-by: Thomas Petazzoni
    Fixes: e60304f8cb7bb545e79fe62d9b9762460c254ec2 ("arm: mvebu: Add hardware I/O Coherency support")
    Acked-by: Gregory CLEMENT
    Link: https://lkml.kernel.org/r/1415871540-20302-3-git-send-email-thomas.petazzoni@free-electrons.com
    Signed-off-by: Jason Cooper
    Signed-off-by: Greg Kroah-Hartman

    Thomas Petazzoni
     
  • commit 4bf9636c39ac70da091d5a2e28d3448eaa7f115c upstream.

    Commit 9fc2105aeaaf ("ARM: 7830/1: delay: don't bother reporting
    bogomips in /proc/cpuinfo") breaks audio in python, and probably
    elsewhere, with message

    FATAL: cannot locate cpu MHz in /proc/cpuinfo

    I'm not the first one to hit it, see for example

    https://theredblacktree.wordpress.com/2014/08/10/fatal-cannot-locate-cpu-mhz-in-proccpuinfo/
    https://devtalk.nvidia.com/default/topic/765800/workaround-for-fatal-cannot-locate-cpu-mhz-in-proc-cpuinf/?offset=1

    Reading original changelog, I have to say "Stop breaking working setups.
    You know who you are!".

    Signed-off-by: Pavel Machek
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Pavel Machek
     
  • commit 9008d83fe9dc2e0f19b8ba17a423b3759d8e0fd7 upstream.

    Commit 705814b5ea6f ("ARM: OMAP4+: PM: Consolidate OMAP4 PM code to
    re-use it for OMAP5") moved logic generic to OMAP5+ into the init
    routine by introducing omap4_pm_init. However, the patch left the
    powerdomain initial setup, an unused omap4430 es1.0 check, and a
    spurious "Power Management for TI OMAP4." log in the original code.

    Remove the duplicate code which is already present in omap4_pm_init from
    omap4_init_static_deps.

    As part of this change, also move the u-boot version print out of the
    static dependency function to the omap4_pm_init function.

    Fixes: 705814b5ea6f ("ARM: OMAP4+: PM: Consolidate OMAP4 PM code to re-use it for OMAP5")
    Signed-off-by: Nishanth Menon
    Signed-off-by: Tony Lindgren
    Signed-off-by: Greg Kroah-Hartman

    Nishanth Menon
     
  • commit 5e794de514f56de1e78e979ca09c56a91aa2e9f1 upstream.

    The PWM block is required as the system clock source, so it must always
    be enabled. This patch fixes boot issues on SMDK6410, which did not
    have the node enabled explicitly for other purposes.

    Fixes: eeb93d02 ("clocksource: of: Respect device tree node status")

    Signed-off-by: Tomasz Figa
    Signed-off-by: Kukjin Kim
    Signed-off-by: Greg Kroah-Hartman

    Tomasz Figa
     
  • commit be6688350a4470e417aaeca54d162652aab40ac5 upstream.

    The OMAP wdt driver supports only the ti,omap3-wdt compatible. In the
    DRA7 dts the wdt compatible property was defined as ti,omap4-wdt by
    mistake instead of ti,omap3-wdt. Correct the typo.

    Fixes: 6e58b8f1daaf1a ("ARM: dts: DRA7: Add the dts files for dra7 SoC and dra7-evm board")
    Signed-off-by: Lokesh Vutla
    Signed-off-by: Tony Lindgren
    Signed-off-by: Greg Kroah-Hartman

    Lokesh Vutla
     
  • commit 269ad8015a6b2bb1cf9e684da4921eb6fa0a0c88 upstream.

    The dl_runtime_exceeded() function is supposed to check if
    a SCHED_DEADLINE task must be throttled, by checking if its
    current runtime is <= 0.
    Signed-off-by: Peter Zijlstra (Intel)
    Acked-by: Juri Lelli
    Cc: Dario Faggioli
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/1418813432-20797-3-git-send-email-luca.abeni@unitn.it
    Signed-off-by: Ingo Molnar
    Signed-off-by: Greg Kroah-Hartman

    Luca Abeni
     
  • commit 6a503c3be937d275113b702e0421e5b0720abe8a upstream.

    According to global EDF, tasks should be migrated between runqueues
    without checking if their scheduling deadlines and runtimes are valid.
    However, SCHED_DEADLINE currently performs such a check:
    a migration happens doing:

        deactivate_task(rq, next_task, 0);
        set_task_cpu(next_task, later_rq->cpu);
        activate_task(later_rq, next_task, 0);

    which ends up calling dequeue_task_dl(), setting the new CPU, and then
    calling enqueue_task_dl().

    enqueue_task_dl() then calls enqueue_dl_entity(), which calls
    update_dl_entity(), which can modify scheduling deadline and runtime,
    breaking global EDF scheduling.

    As a result, some of the properties of global EDF are not respected:
    for example, a taskset {(30, 80), (40, 80), (120, 170)} scheduled on
    two cores can have unbounded response times for the third task even
    if 30/80+40/80+120/170 = 1.5809 < 2

    This can be fixed by invoking update_dl_entity() only in case of
    wakeup, or if this is a new SCHED_DEADLINE task.

    Signed-off-by: Luca Abeni
    Signed-off-by: Peter Zijlstra (Intel)
    Acked-by: Juri Lelli
    Cc: Dario Faggioli
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/1418813432-20797-2-git-send-email-luca.abeni@unitn.it
    Signed-off-by: Ingo Molnar
    Signed-off-by: Greg Kroah-Hartman

    Luca Abeni
     
  • commit 7b990789a4c3420fa57596b368733158e432d444 upstream.

    The change from \d+ to .+ inside __aligned() means that the following
    structure:

        struct test {
                u8 a __aligned(2);
                u8 b __aligned(2);
        };

    essentially gets modified to

        struct test {
                u8 a;
        };

    for purposes of kernel-doc, thus dropping a struct member, which in
    turns causes warnings and invalid kernel-doc generation.

    Fix this by replacing the catch-all (".") with anything that's not a
    semicolon ("[^;]").

    Fixes: 9dc30918b23f ("scripts/kernel-doc: handle struct member __aligned without numbers")
    Signed-off-by: Johannes Berg
    Cc: Nishanth Menon
    Cc: Randy Dunlap
    Cc: Michal Marek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Johannes Berg
     
  • commit 705304a863cc41585508c0f476f6d3ec28cf7e00 upstream.

    Same story as in commit 41080b5a2401 ("nfsd race fixes: ext2") (similar
    ext2 fix), except that nilfs2 needs to use insert_inode_locked4()
    instead of insert_inode_locked(), and a bug in a check for dead inodes
    needs to be fixed.

    If nilfs_iget() is called from nfsd after nilfs_new_inode() calls
    insert_inode_locked4(), nilfs_iget() will wait for unlock_new_inode() at
    the end of nilfs_mkdir()/nilfs_create()/etc to unlock the inode.

    If nilfs_iget() is called before nilfs_new_inode() calls
    insert_inode_locked4(), it will create an in-core inode and read its
    data from the on-disk inode. But, nilfs_iget() will find i_nlink equals
    zero and fail at nilfs_read_inode_common(), which will lead it to call
    iget_failed() and cleanly fail.

    However, this sanity check doesn't work as expected for reused on-disk
    inodes because they leave a non-zero value in i_mode field and it
    hinders the test of i_nlink. This patch also fixes the issue by
    removing the test on i_mode that nilfs2 doesn't need.

    Signed-off-by: Ryusuke Konishi
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Ryusuke Konishi
     
  • commit 68f29815034e9dc9ed53cad85946c32b07adc8cc upstream.

    The torture test should quit once it actually induces an error in the
    flash. This step was accidentally removed during refactoring.

    Without this fix, the torturetest just continues infinitely, or until
    the maximum cycle count is reached. e.g.:

    ...
    [ 7619.218171] mtd_test: error -5 while erasing EB 100
    [ 7619.297981] mtd_test: error -5 while erasing EB 100
    [ 7619.377953] mtd_test: error -5 while erasing EB 100
    [ 7619.457998] mtd_test: error -5 while erasing EB 100
    [ 7619.537990] mtd_test: error -5 while erasing EB 100
    ...

    Fixes: 6cf78358c94f ("mtd: mtd_torturetest: use mtd_test helpers")
    Signed-off-by: Brian Norris
    Cc: Akinobu Mita
    Signed-off-by: Greg Kroah-Hartman

    Brian Norris
     
  • commit 00bd8edb861eb41d274938cfc0338999d9c593a3 upstream.

    send_mds_reconnect() may call discard_cap_releases() after all
    release messages have been dropped by cleanup_cap_releases().

    Signed-off-by: Yan, Zheng
    Reviewed-by: Sage Weil
    Cc: Markus Blank-Burian
    Signed-off-by: Greg Kroah-Hartman

    Yan, Zheng
     
  • commit 021b77bee210843bed1ea91b5cad58235ff9c8e5 upstream.

    Probably this code was syncing a lot more often than intended because
    the do_sync variable wasn't set to zero.

    Fixes: c62988ec0910 ('ceph: avoid meaningless calling ceph_caps_revoking if sync_mode == WB_SYNC_ALL.')
    Signed-off-by: Dan Carpenter
    Signed-off-by: Ilya Dryomov
    Signed-off-by: Greg Kroah-Hartman

    Dan Carpenter
     
  • commit 5a64e56976f1ba98743e1678c0029a98e9034c81 upstream.

    Fix a bug where nfsd4_encode_components_esc() includes the esc_end char as
    an additional string encoding.

    Signed-off-by: Benjamin Coddington
    Fixes: e7a0444aef4a "nfsd: add IPv6 addr escaping to fs_location hosts"
    Signed-off-by: J. Bruce Fields
    Signed-off-by: Greg Kroah-Hartman

    Benjamin Coddington
     
  • commit ef17af2a817db97d42dd2ec0a425231748e23dbc upstream.

    Bugs similar to the one in acbbe6fbb240 (kcmp: fix standard comparison
    bug) are in rich supply.

    In this variant, the problem is that struct xdr_netobj::len has type
    unsigned int, so the expression o1->len - o2->len _also_ has type
    unsigned int; it has completely well-defined semantics, and the result
    is some non-negative integer, which is always representable in a long
    long. But this means that if the conditional triggers, we are
    guaranteed to return a positive value from compare_blob.

    In this case it could be fixed by

    - res = o1->len - o2->len;
    + res = (long long)o1->len - (long long)o2->len;

    but I'd rather eliminate the usually broken 'return a - b;' idiom.

    Reviewed-by: Jeff Layton
    Signed-off-by: Rasmus Villemoes
    Signed-off-by: J. Bruce Fields
    Signed-off-by: Greg Kroah-Hartman

    Rasmus Villemoes
     
  • commit 04a258c162a85c0f4ae56be67634dc43c9a4fa9b upstream.

    When built with Debug, the following crash is sometimes observed:
    Call Trace:
    [] string+0x40/0x100
    [] vsnprintf+0x218/0x5e0
    [] ? trace_hardirqs_off+0xd/0x10
    [] vscnprintf+0x11/0x30
    [] vprintk+0xd0/0x5c0
    [] ? vmbus_process_rescind_offer+0x0/0x110 [hv_vmbus]
    [] printk+0x41/0x45
    [] vmbus_device_unregister+0x2c/0x40 [hv_vmbus]
    [] vmbus_process_rescind_offer+0x2b/0x110 [hv_vmbus]
    ...

    This happens due to the following race: between the 'if (channel->device_obj)'
    check in vmbus_process_rescind_offer() and the pr_debug() in
    vmbus_device_unregister(), the device can disappear. Fix the issue by
    taking an additional reference to the device before proceeding to
    vmbus_device_unregister().

    Signed-off-by: Vitaly Kuznetsov
    Signed-off-by: K. Y. Srinivasan
    Signed-off-by: Greg Kroah-Hartman

    Vitaly Kuznetsov
     
  • commit 8bfbe2de769afda051c56aba5450391670e769fc upstream.

    Commit 19e2ad6a09f0c06dbca19c98e5f4584269d913dd ("n_tty: Remove overflow
    tests from receive_buf() path") moved the increment of read_head into
    the arguments list of read_buf_addr(). Function calls represent a
    sequence point in C. Therefore read_head is incremented before the
    character c is placed in the buffer. Since the circular read buffer has
    been a lockless design since commit 6d76bd2618535c581f1673047b8341fd291abc67
    ("n_tty: Make N_TTY ldisc receive path lockless"), this creates a race
    condition that leads to communication errors.

    This patch modifies the code to increment read_head _after_ the data
    is placed in the buffer and thus fixes the race for non-SMP machines.
    To fix the problem for SMP machines, memory barriers must be added in
    a separate patch.

    Signed-off-by: Christian Riesch
    Signed-off-by: Greg Kroah-Hartman

    Christian Riesch
     
  • commit 1ff383a4c3eda8893ec61b02831826e1b1f46b41 upstream.

    This patch adds a wait until the transmit buffer and shifter are empty
    before disabling the clock.

    Without this fix, it's possible for the clock to be disabled while data
    has not yet been transmitted, which leaves the TX line in an improper
    state and causes problems in subsequent data transfers.

    Signed-off-by: Robert Baldyga
    Signed-off-by: Greg Kroah-Hartman

    Robert Baldyga
     
  • commit aee4e5f3d3abb7a2239dd02f6d8fb173413fd02f upstream.

    When recording the state of a task for the sched_switch tracepoint a check of
    task_preempt_count() is performed to see if PREEMPT_ACTIVE is set. This is
    because, technically, a task being preempted is really in the TASK_RUNNING
    state, and that is what should be recorded when tracing a sched_switch,
    even if the task put itself into another state (it hasn't scheduled out
    in that state yet).

    But with the change to use per_cpu preempt counts, the
    task_thread_info(p)->preempt_count is no longer used, and instead
    task_preempt_count(p) is used.

    The problem is that this does not use the current preempt count but a stale
    one from a previous sched_switch. The task_preempt_count(p) uses
    saved_preempt_count and not preempt_count(). But for tracing sched_switch,
    if p is current, we really want preempt_count().

    I hit this bug when I was tracing sleep and the call from do_nanosleep()
    scheduled out in the "RUNNING" state.

    sleep-4290 [000] 537272.259992: sched_switch: sleep:4290 [120] R ==> swapper/0:0 [120]
    sleep-4290 [000] 537272.260015: kernel_stack:
    => __schedule (ffffffff8150864a)
    => schedule (ffffffff815089f8)
    => do_nanosleep (ffffffff8150b76c)
    => hrtimer_nanosleep (ffffffff8108d66b)
    => SyS_nanosleep (ffffffff8108d750)
    => return_to_handler (ffffffff8150e8e5)
    => tracesys_phase2 (ffffffff8150c844)

    After a bit of hair pulling, I found that the state was really
    TASK_INTERRUPTIBLE, but the saved_preempt_count had an old PREEMPT_ACTIVE
    set and caused the sched_switch tracepoint to show it as RUNNING.

    Link: http://lkml.kernel.org/r/20141210174428.3cb7542a@gandalf.local.home

    Acked-by: Ingo Molnar
    Cc: Peter Zijlstra
    Fixes: 01028747559a "sched: Create more preempt_count accessors"
    Signed-off-by: Steven Rostedt
    Signed-off-by: Greg Kroah-Hartman

    Steven Rostedt (Red Hat)
     
  • commit 9c6ac78eb3521c5937b2dd8a7d1b300f41092f45 upstream.

    After invoking ->dirty_inode(), __mark_inode_dirty() does smp_mb() and
    tests inode->i_state locklessly to see whether it already has all the
    necessary I_DIRTY bits set. The comment above the barrier doesn't
    contain any useful information - memory barriers can't ensure "changes
    are seen by all cpus" by itself.

    And it sure enough was broken. Please consider the following
    scenario.

    CPU 0                                  CPU 1
    -----------------------------------------------------------------------

                                           enters __writeback_single_inode()
                                           grabs inode->i_lock
                                           tests PAGECACHE_TAG_DIRTY which is clear
    enters __set_page_dirty()
    grabs mapping->tree_lock
    sets PAGECACHE_TAG_DIRTY
    releases mapping->tree_lock
    leaves __set_page_dirty()

    enters __mark_inode_dirty()
    smp_mb()
    sees I_DIRTY_PAGES set
    leaves __mark_inode_dirty()
                                           clears I_DIRTY_PAGES
                                           releases inode->i_lock

    Now @inode has dirty pages w/ I_DIRTY_PAGES clear. This doesn't seem
    to lead to an immediately critical problem because requeue_inode()
    later checks PAGECACHE_TAG_DIRTY instead of I_DIRTY_PAGES when
    deciding whether the inode needs to be requeued for IO and there are
    enough unintentional memory barriers in between, so while the inode
    ends up with inconsistent I_DIRTY_PAGES flag, it doesn't fall off the
    IO list.

    The lack of explicit barrier may also theoretically affect the other
    I_DIRTY bits which deal with metadata dirtiness. There is no
    guarantee that a strong enough barrier exists between
    I_DIRTY_[DATA]SYNC clearing and write_inode() writing out the dirtied
    inode. The filesystem inode writeout path likely has enough stuff that
    can behave as a full barrier, but it's theoretically possible that the
    writeout may not see all the updates from ->dirty_inode().

    Fix it by adding an explicit smp_mb() after I_DIRTY clearing. Note
    that I_DIRTY_PAGES needs a special treatment as it always needs to be
    cleared to be interlocked with the lockless test on
    __mark_inode_dirty() side. It's cleared unconditionally and
    reinstated after smp_mb() if the mapping still has dirty pages.

    Also add comments explaining how and why the barriers are paired.

    Lightly tested.

    Signed-off-by: Tejun Heo
    Cc: Jan Kara
    Cc: Mikulas Patocka
    Cc: Jens Axboe
    Cc: Al Viro
    Reviewed-by: Jan Kara
    Signed-off-by: Jens Axboe
    Signed-off-by: Greg Kroah-Hartman

    Tejun Heo
     
  • commit d908f8478a8d18e66c80a12adb27764920c1f1ca upstream.

    If probe() fails, not only do the attributes need to be removed,
    but the memory must also be freed.

    Reported-by: Ahmed Tamrawi
    Signed-off-by: Oliver Neukum
    Signed-off-by: Greg Kroah-Hartman

    Oliver Neukum
     
  • commit 5fabcb4c33fe11c7e3afdf805fde26c1a54d0953 upstream.

    We can get here from blkdev_ioctl() -> blkpg_ioctl() -> add_partition()
    with a user passed in partno value. If we pass in 0x7fffffff, the
    new target in disk_expand_part_tbl() overflows the 'int' and we
    access beyond the end of ptbl->part[] and even write to it when we
    do the rcu_assign_pointer() to assign the new partition.

    Reported-by: David Ramos
    Signed-off-by: Jens Axboe
    Signed-off-by: Greg Kroah-Hartman

    Jens Axboe
     
  • commit 007487f1fd43d84f26cda926081ca219a24ecbc4 upstream.

    Currently we enable Exynos devices in the multi v7 defconfig; however, when
    testing on my ODROID-U3, I noticed that USB was not working. Enabling this
    option causes USB to work, which enables networking support as well since the
    ODROID-U3 has networking on the USB bus.

    [arnd] Support for odroid-u3 was added in 3.10, so it would be nice to
    backport this fix at least that far.

    Signed-off-by: Steev Klimaszewski
    Signed-off-by: Arnd Bergmann
    Signed-off-by: Greg Kroah-Hartman

    Steev Klimaszewski
     
  • commit 403dff4e2c94f275e24fd85f40b2732ffec268a1 upstream.

    We need to check that we have both a valid data and control interface for
    both types of headers (union and non-union).

    References: https://bugzilla.kernel.org/show_bug.cgi?id=83551
    Reported-by: Simon Schubert
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     
  • commit c507de88f6a336bd7296c9ec0073b2d4af8b4f5e upstream.

    stac_store_hints() gets the masking of the gpio_dir and gpio_data
    values utterly wrong, likely due to copy&paste errors. Fortunately,
    this feature is used very rarely, so the impact should be really small.

    Reported-by: Rasmus Villemoes
    Signed-off-by: Takashi Iwai
    Signed-off-by: Greg Kroah-Hartman

    Takashi Iwai
     
  • commit 69eba10e606a80665f8573221fec589430d9d1cb upstream.

    In olden times the snd_hda_param_read() function always set "*start_id"
    but in 2007 we introduced a new return and it causes uninitialized data
    bugs in a couple of the callers: print_codec_info() and
    hdmi_parse_codec().

    Fixes: e8a7f136f5ed ('[ALSA] hda-intel - Improve HD-audio codec probing robustness')
    Signed-off-by: Dan Carpenter
    Signed-off-by: Takashi Iwai
    Signed-off-by: Greg Kroah-Hartman

    Dan Carpenter