Eric Lee / smarc-fsl-linux-kernel

13 Nov, 2015

1 commit

7dac7102a Merge tag 'for-4.4' of git://git.osdn.jp/gitroot/uclinux-h8/linux ... Browse Code »

Pull h8300 updates from Yoshinori Sato:
"Some bug fixes"

* tag 'for-4.4' of git://git.osdn.jp/gitroot/uclinux-h8/linux:
h8300: enable CLKSRC_OF
h8300: Don't set CROSS_COMPILE unconditionally
asm-generic: {get,put}_user ptr argument evaluate only 1 time
h8300: bit io fix
h8300: zImage fix
h8300: register address fix
h8300: Fix alignment for .data
h8300: unaligned divcr register support.

Linus Torvalds
2015-11-13 07:26:39 +0800

11 Nov, 2015

1 commit

247e75dba pci: remove pci_dma_supported ... Browse Code »

Signed-off-by: Christoph Hellwig
Cc: "James E.J. Bottomley"
Cc: Helge Deller
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Christoph Hellwig
2015-11-11 08:32:11 +0800

08 Nov, 2015

1 commit

a02613a4b asm-generic: {get,put}_user ptr argument evaluate only 1 time ... Browse Code »

Current implemantation ptr argument evaluate 2 times.
It'll be an unexpected result.

Changes v5:
Remove unnecessary const.
Changes v4:
Temporary pointer type change to const void*
Changes v3:
Some build error fix.
Changes v2:
Argument x protect.

Signed-off-by: Yoshinori Sato

Yoshinori Sato
2015-11-08 21:44:42 +0800

07 Nov, 2015

1 commit

9cf5c095b Merge tag 'asm-generic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic ... Browse Code »

Pull asm-generic cleanups from Arnd Bergmann:
"The asm-generic changes for 4.4 are mostly a series from Christoph
Hellwig to clean up various abuses of headers in there. The patch to
rename the io-64-nonatomic-*.h headers caused some conflicts with new
users, so I added a workaround that we can remove in the next merge
window.

The only other patch is a warning fix from Marek Vasut"

* tag 'asm-generic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
asm-generic: temporarily add back asm-generic/io-64-nonatomic*.h
asm-generic: cmpxchg: avoid warnings from macro-ized cmpxchg() implementations
gpio-mxc: stop including
n_tracesink: stop including
n_tracerouter: stop including
mlx5: stop including
hifn_795x: stop including
drbd: stop including
move count_zeroes.h out of asm-generic
move io-64-nonatomic*.h out of asm-generic

Linus Torvalds
2015-11-07 06:22:15 +0800

05 Nov, 2015

2 commits

0d51ce9ca Merge tag 'pm+acpi-4.4-rc1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm ... Browse Code »

Pull power management and ACPI updates from Rafael Wysocki:
"Quite a new features are included this time.

First off, the Collaborative Processor Performance Control interface
(version 2) defined by ACPI will now be supported on ARM64 along with
a cpufreq frontend for CPU performance scaling.

Second, ACPI gets a new infrastructure for the early probing of IRQ
chips and clock sources (along the lines of the existing similar
mechanism for DT).

Next, the ACPI core and the generic device properties API will now
support a recently introduced hierarchical properties extension of the
_DSD (Device Specific Data) ACPI device configuration object. If the
ACPI platform firmware uses that extension to organize device
properties in a hierarchical way, the kernel will automatically handle
it and make those properties available to device drivers via the
generic device properties API.

It also will be possible to build the ACPICA's AML interpreter
debugger into the kernel now and use that to diagnose AML-related
problems more efficiently. In the future, this should make it
possible to single-step AML execution and do similar things.
Interesting stuff, although somewhat experimental at this point.

Finally, the PM core gets a new mechanism that can be used by device
drivers to distinguish between suspend-to-RAM (based on platform
firmware support) and suspend-to-idle (or other variants of system
suspend the platform firmware is not involved in) and possibly
optimize their device suspend/resume handling accordingly.

In addition to that, some existing features are re-organized quite
substantially.

First, the ACPI-based handling of PCI host bridges on x86 and ia64 is
unified and the common code goes into the ACPI core (so as to reduce
code duplication and eliminate non-essential differences between the
two architectures in that area).

Second, the Operating Performance Points (OPP) framework is
reorganized to make the code easier to find and follow.

Next, the cpufreq core's sysfs interface is reorganized to get rid of
the "primary CPU" concept for configurations in which the same
performance scaling settings are shared between multiple CPUs.

Finally, some interfaces that aren't necessary any more are dropped
from the generic power domains framework.

On top of the above we have some minor extensions, cleanups and bug
fixes in multiple places, as usual.

Specifics:

- ACPICA update to upstream revision 20150930 (Bob Moore, Lv Zheng).

The most significant change is to allow the AML debugger to be
built into the kernel. On top of that there is an update related
to the NFIT table (the ACPI persistent memory interface) and a few
fixes and cleanups.

- ACPI CPPC2 (Collaborative Processor Performance Control v2) support
along with a cpufreq frontend (Ashwin Chaugule).

This can only be enabled on ARM64 at this point.

- New ACPI infrastructure for the early probing of IRQ chips and
clock sources (Marc Zyngier).

- Support for a new hierarchical properties extension of the ACPI
_DSD (Device Specific Data) device configuration object allowing
the kernel to handle hierarchical properties (provided by the
platform firmware this way) automatically and make them available
to device drivers via the generic device properties interface
(Rafael Wysocki).

- Generic device properties API extension to obtain an index of
certain string value in an array of strings, along the lines of
of_property_match_string(), but working for all of the supported
firmware node types, and support for the "dma-names" device
property based on it (Mika Westerberg).

- ACPI core fix to parse the MADT (Multiple APIC Description Table)
entries in the order expected by platform firmware (and mandated by
the specification) to avoid confusion on systems with more than 255
logical CPUs (Lukasz Anaczkowski).

- Consolidation of the ACPI-based handling of PCI host bridges on x86
and ia64 (Jiang Liu).

- ACPI core fixes to ensure that the correct IRQ number is used to
represent the SCI (System Control Interrupt) in the cases when it
has been re-mapped (Chen Yu).

- New ACPI backlight quirk for Lenovo IdeaPad S405 (Hans de Goede).

- ACPI EC driver fixes (Lv Zheng).

- Assorted ACPI fixes and cleanups (Dan Carpenter, Insu Yun, Jiri
Kosina, Rami Rosen, Rasmus Villemoes).

- New mechanism in the PM core allowing drivers to check if the
platform firmware is going to be involved in the upcoming system
suspend or if it has been involved in the suspend the system is
resuming from at the moment (Rafael Wysocki).

This should allow drivers to optimize their suspend/resume handling
in some cases and the changes include a couple of users of it (the
i8042 input driver, PCI PM).

- PCI PM fix to prevent runtime-suspended devices with PME enabled
from being resumed during system suspend even if they aren't
configured to wake up the system from sleep (Rafael Wysocki).

- New mechanism to report the number of a wakeup IRQ that woke up the
system from sleep last time (Alexandra Yates).

- Removal of unused interfaces from the generic power domains
framework and fixes related to latency measurements in that code
(Ulf Hansson, Daniel Lezcano).

- cpufreq core sysfs interface rework to make it handle CPUs that
share performance scaling settings (represented by a common cpufreq
policy object) more symmetrically (Viresh Kumar).

This should help to simplify the CPU offline/online handling among
other things.

- cpufreq core fixes and cleanups (Viresh Kumar).

- intel_pstate fixes related to the Turbo Activation Ratio (TAR)
mechanism on client platforms which causes the turbo P-states range
to vary depending on platform firmware settings (Srinivas
Pandruvada).

- intel_pstate sysfs interface fix (Prarit Bhargava).

- Assorted cpufreq driver (imx, tegra20, powernv, integrator) fixes
and cleanups (Bai Ping, Bartlomiej Zolnierkiewicz, Shilpasri G
Bhat, Luis de Bethencourt).

- cpuidle mvebu driver cleanups (Russell King).

- OPP (Operating Performance Points) framework code reorganization to
make it more maintainable (Viresh Kumar).

- Intel Broxton support for the RAPL (Running Average Power Limits)
power capping driver (Amy Wiles).

- Assorted power management code fixes and cleanups (Dan Carpenter,
Geert Uytterhoeven, Geliang Tang, Luis de Bethencourt, Rasmus
Villemoes)"

* tag 'pm+acpi-4.4-rc1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (108 commits)
cpufreq: postfix policy directory with the first CPU in related_cpus
cpufreq: create cpu/cpufreq/policyX directories
cpufreq: remove cpufreq_sysfs_{create|remove}_file()
cpufreq: create cpu/cpufreq at boot time
cpufreq: Use cpumask_copy instead of cpumask_or to copy a mask
cpufreq: ondemand: Drop unnecessary locks from update_sampling_rate()
PM / Domains: Merge measurements for PM QoS device latencies
PM / Domains: Don't measure ->start|stop() latency in system PM callbacks
PM / clk: Fix broken build due to non-matching code and header #ifdefs
ACPI / Documentation: add copy_dsdt to ACPI format options
ACPI / sysfs: correctly check failing memory allocation
ACPI / video: Add a quirk to force native backlight on Lenovo IdeaPad S405
ACPI / CPPC: Fix potential memory leak
ACPI / CPPC: signedness bug in register_pcc_channel()
ACPI / PAD: power_saving_thread() is not freezable
ACPI / PM: Fix incorrect wakeup IRQ setting during suspend-to-idle
ACPI: Using correct irq when waiting for events
ACPI: Use correct IRQ when uninstalling ACPI interrupt handler
cpuidle: mvebu: disable the bind/unbind attributes and use builtin_platform_driver
cpuidle: mvebu: clean up multiple platform drivers
...

Linus Torvalds
2015-11-05 10:10:13 +0800
e627078a0 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux ... Browse Code »

Pull s390 updates from Martin Schwidefsky:
"There is only one new feature in this pull for the 4.4 merge window,
most of it is small enhancements, cleanup and bug fixes:

- Add the s390 backend for the software dirty bit tracking. This
adds two new pgtable functions pte_clear_soft_dirty and
pmd_clear_soft_dirty which is why there is a hit to
arch/x86/include/asm/pgtable.h in this pull request.

- A series of cleanup patches for the AP bus, this includes the
removal of the support for two outdated crypto cards (PCICC and
PCICA).

- The irq handling / signaling on buffer full in the runtime
instrumentation code is dropped.

- Some micro optimizations: remove unnecessary memory barriers for a
couple of functions: [smb_]rmb, [smb_]wmb, atomics, bitops, and for
spin_unlock. Use the builtin bswap if available and make
test_and_set_bit_lock more cache friendly.

- Statistics and a tracepoint for the diagnose calls to the
hypervisor.

- The CPU measurement facility support to sample KVM guests is
improved.

- The vector instructions are now always enabled for user space
processes if the hardware has the vector facility. This simplifies
the FPU handling code. The fpu-internal.h header is split into fpu
internals, api and types just like x86.

- Cleanup and improvements for the common I/O layer.

- Rework udelay to solve a problem with kprobe. udelay has busy loop
semantics but still uses an idle processor state for the wait"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (66 commits)
s390: remove runtime instrumentation interrupts
s390/cio: de-duplicate subchannel validation
s390/css: unneeded initialization in for_each_subchannel
s390/Kconfig: use builtin bswap
s390/dasd: fix disconnected device with valid path mask
s390/dasd: fix invalid PAV assignment after suspend/resume
s390/dasd: fix double free in dasd_eckd_read_conf
s390/kernel: fix ptrace peek/poke for floating point registers
s390/cio: move ccw_device_stlck functions
s390/cio: move ccw_device_call_handler
s390/topology: reduce per_cpu() invocations
s390/nmi: reduce size of percpu variable
s390/nmi: fix terminology
s390/nmi: remove casts
s390/nmi: remove pointless error strings
s390: don't store registers on disabled wait anymore
s390: get rid of __set_psw_mask()
s390/fpu: split fpu-internal.h into fpu internals, api, and type headers
s390/dasd: fix list_del corruption after lcu changes
s390/spinlock: remove unneeded serializations at unlock
...

Linus Torvalds
2015-11-05 03:31:31 +0800

04 Nov, 2015

3 commits

53528695f Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip ... Browse Code »

Pull scheduler changes from Ingo Molnar:
"The main changes in this cycle were:

- sched/fair load tracking fixes and cleanups (Byungchul Park)

- Make load tracking frequency scale invariant (Dietmar Eggemann)

- sched/deadline updates (Juri Lelli)

- stop machine fixes, cleanups and enhancements for bugs triggered by
CPU hotplug stress testing (Oleg Nesterov)

- scheduler preemption code rework: remove PREEMPT_ACTIVE and related
cleanups (Peter Zijlstra)

- Rework the sched_info::run_delay code to fix races (Peter Zijlstra)

- Optimize per entity utilization tracking (Peter Zijlstra)

- ... misc other fixes, cleanups and smaller updates"

* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (57 commits)
sched: Don't scan all-offline ->cpus_allowed twice if !CONFIG_CPUSETS
sched: Move cpu_active() tests from stop_two_cpus() into migrate_swap_stop()
sched: Start stopper early
stop_machine: Kill cpu_stop_threads->setup() and cpu_stop_unpark()
stop_machine: Kill smp_hotplug_thread->pre_unpark, introduce stop_machine_unpark()
stop_machine: Change cpu_stop_queue_two_works() to rely on stopper->enabled
stop_machine: Introduce __cpu_stop_queue_work() and cpu_stop_queue_two_works()
stop_machine: Ensure that a queued callback will be called before cpu_stop_park()
sched/x86: Fix typo in __switch_to() comments
sched/core: Remove a parameter in the migrate_task_rq() function
sched/core: Drop unlikely behind BUG_ON()
sched/core: Fix task and run queue sched_info::run_delay inconsistencies
sched/numa: Fix task_tick_fair() from disabling numa_balancing
sched/core: Add preempt_count invariant check
sched/core: More notrace annotations
sched/core: Kill PREEMPT_ACTIVE
sched/core, sched/x86: Kill thread_info::saved_preempt_count
sched/core: Simplify preempt_count tests
sched/core: Robustify preemption leak checks
sched/core: Stop setting PREEMPT_ACTIVE
...

Linus Torvalds
2015-11-04 10:03:50 +0800
105ff3cbf atomic: remove all traces of READ_ONCE_CTRL() and atomic*_read_ctrl() ... Browse Code »

This seems to be a mis-reading of how alpha memory ordering works, and
is not backed up by the alpha architecture manual. The helper functions
don't do anything special on any other architectures, and the arguments
that support them being safe on other architectures also argue that they
are safe on alpha.

Basically, the "control dependency" is between a previous read and a
subsequent write that is dependent on the value read. Even if the
subsequent write is actually done speculatively, there is no way that
such a speculative write could be made visible to other cpu's until it
has been committed, which requires validating the speculation.

Note that most weakely ordered architectures (very much including alpha)
do not guarantee any ordering relationship between two loads that depend
on each other on a control dependency:

read A
if (val == 1)
read B

because the conditional may be predicted, and the "read B" may be
speculatively moved up to before reading the value A. So we require the
user to insert a smp_rmb() between the two accesses to be correct:

read A;
if (A == 1)
smp_rmb()
read B

Alpha is further special in that it can break that ordering even if the
*address* of B depends on the read of A, because the cacheline that is
read later may be stale unless you have a memory barrier in between the
pointer read and the read of the value behind a pointer:

read ptr
read offset(ptr)

whereas all other weakly ordered architectures guarantee that the data
dependency (as opposed to just a control dependency) will order the two
accesses. As a result, alpha needs a "smp_read_barrier_depends()" in
between those two reads for them to be ordered.

The coontrol dependency that "READ_ONCE_CTRL()" and "atomic_read_ctrl()"
had was a control dependency to a subsequent *write*, however, and
nobody can finalize such a subsequent write without having actually done
the read. And were you to write such a value to a "stale" cacheline
(the way the unordered reads came to be), that would seem to lose the
write entirely.

So the things that make alpha able to re-order reads even more
aggressively than other weak architectures do not seem to be relevant
for a subsequent write. Alpha memory ordering may be strange, but
there's no real indication that it is *that* strange.

Also, the alpha architecture reference manual very explicitly talks
about the definition of "Dependence Constraints" in section 5.6.1.7,
where a preceding read dominates a subsequent write.

Such a dependence constraint admittedly does not impose a BEFORE (alpha
architecture term for globally visible ordering), but it does guarantee
that there can be no "causal loop". I don't see how you could avoid
such a loop if another cpu could see the stored value and then impact
the value of the first read. Put another way: the read and the write
could not be seen as being out of order wrt other cpus.

So I do not see how these "x_ctrl()" functions can currently be necessary.

I may have to eat my words at some point, but in the absense of clear
proof that alpha actually needs this, or indeed even an explanation of
how alpha could _possibly_ need it, I do not believe these functions are
called for.

And if it turns out that alpha really _does_ need a barrier for this
case, that barrier still should not be "smp_read_barrier_depends()".
We'd have to make up some new speciality barrier just for alpha, along
with the documentation for why it really is necessary.

Cc: Peter Zijlstra
Cc: Paul E McKenney
Cc: Dmitry Vyukov
Cc: Will Deacon
Cc: Ingo Molnar
Signed-off-by: Linus Torvalds

Linus Torvalds
2015-11-04 09:22:17 +0800
d63a97886 Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip ... Browse Code »

Pull locking changes from Ingo Molnar:
"The main changes in this cycle were:

- More gradual enhancements to atomic ops: new atomic*_read_ctrl()
ops, synchronize atomic_{read,set}() ordering requirements between
architectures, add atomic_long_t bitops. (Peter Zijlstra)

- Add _{relaxed|acquire|release}() variants for inc/dec atomics and
use them in various locking primitives: mutex, rtmutex, mcs, rwsem.
This enables weakly ordered architectures (such as arm64) to make
use of more locking related optimizations. (Davidlohr Bueso)

- Implement atomic[64]_{inc,dec}_relaxed() on ARM. (Will Deacon)

- Futex kernel data cache footprint micro-optimization. (Rasmus
Villemoes)

- pvqspinlock runtime overhead micro-optimization. (Waiman Long)

- misc smaller fixlets"

* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
ARM, locking/atomics: Implement _relaxed variants of atomic[64]_{inc,dec}
locking/rwsem: Use acquire/release semantics
locking/mcs: Use acquire/release semantics
locking/rtmutex: Use acquire/release semantics
locking/mutex: Use acquire/release semantics
locking/asm-generic: Add _{relaxed|acquire|release}() variants for inc/dec atomics
atomic: Implement atomic_read_ctrl()
atomic, arch: Audit atomic_{read,set}()
atomic: Add atomic_long_t bitops
futex: Force hot variables into a single cache line
locking/pvqspinlock: Kick the PV CPU unconditionally when _Q_SLOW_VAL
locking/osq: Relax atomic semantics
locking/qrwlock: Rename ->lock to ->wait_lock
locking/Documentation/lockstat: Fix typo - lokcing -> locking
locking/atomics, cmpxchg: Privatize the inclusion of asm/cmpxchg.h

Linus Torvalds
2015-11-04 08:10:43 +0800

26 Oct, 2015

1 commit

e3ed766b4 Merge branch 'acpi-init' ... Browse Code »

* acpi-init:
clocksource: cosmetic: Drop OF 'dependency' from symbols
clocksource / arm_arch_timer: Convert to ACPI probing
clocksource: Add new CLKSRC_{PROBE,ACPI} config symbols
clocksource / ACPI: Add probing infrastructure for ACPI-based clocksources
irqchip / GIC: Convert the GIC driver to ACPI probing
irqchip / ACPI: Add probing infrastructure for ACPI-based irqchips
ACPI: Add early device probing infrastructure

Rafael J. Wysocki
2015-10-26 05:55:14 +0800

17 Oct, 2015

1 commit

bd5e88ad7 mm,thp: reduce ifdef'ery for THP in generic code ... Browse Code »

- pgtable-generic.c: Fold individual #ifdef for each helper into a top
level #ifdef. Makes code more readable

- Converted the stub helpers for !THP to BUILD_BUG() vs. runtime BUG()

Acked-by: Kirill A. Shutemov
Link: http://lkml.kernel.org/r/20151009133450.GA8597@node
Signed-off-by: Vineet Gupta

Vineet Gupta
2015-10-17 20:18:20 +0800

16 Oct, 2015

1 commit

4008cb3ad asm-generic: temporarily add back asm-generic/io-64-nonatomic*.h ... Browse Code »

New users of these files still start showing up in linux-next, so it's
better to have a migration strategy. All existing users as of 4.3-rc4
are converted to use linux/io-64-nonatomic-*.h, and after 4.4-rc1
we can change all the new ones that have come in since, and then
remove this file again.

Signed-off-by: Arnd Bergmann
Reported-by: LKP project

Arnd Bergmann
2015-10-16 18:08:53 +0800

15 Oct, 2015

3 commits

d975440bf asm-generic: cmpxchg: avoid warnings from macro-ized cmpxchg() implementations ... Browse Code »

This change is similar to e001bbae7147b111fe1aa42beaf835635f3c016e
ARM: cmpxchg: avoid warnings from macro-ized cmpxchg() implementations

A recent change in kernel/acct.c added a new warning for many
configurations using generic __xchg() implementation:

In file included from ./arch/nios2/include/asm/cmpxchg.h:12:0,
from include/asm-generic/atomic.h:18,
from arch/nios2/include/generated/asm/atomic.h:1,
from include/linux/atomic.h:4,
from include/linux/spinlock.h:406,
from include/linux/mmzone.h:7,
from include/linux/gfp.h:5,
from include/linux/mm.h:9,
from kernel/acct.c:46:
kernel/acct.c: In function 'acct_pin_kill':
include/asm-generic/cmpxchg.h:94:3: warning: value computed is not used [-Wunused-value]
((__typeof__(*(ptr)))__cmpxchg_local_generic((ptr), (unsigned long)(o),\
^
include/asm-generic/cmpxchg.h:102:28: note: in expansion of macro 'cmpxchg_local'
#define cmpxchg(ptr, o, n) cmpxchg_local((ptr), (o), (n))
^
kernel/acct.c:177:2: note: in expansion of macro 'cmpxchg'
cmpxchg(&acct->ns->bacct, pin, NULL);
^

The code is in fact correct, it's just a cmpxchg() call that
intentionally ignores the result, and no other code does that. The
warning does not show up on x86 because of the way that its cmpxchg()
macro is written. This changes the asm-ggeneric implementation to use
a similar construct with a compound expression instead of a typecast,
which causes the compiler to not complain about an unused result.

Fix the other macros in this file in a similar way, and place them
just below their function implementations.

Signed-off-by: Marek Vasut
Cc: Russell King
Signed-off-by: Arnd Bergmann

Marek Vasut
2015-10-15 06:21:13 +0800
a1164a3ac move count_zeroes.h out of asm-generic ... Browse Code »

This header contains a few helpers currenly only used by the mpi
implementation, and not default implementation of architecture code.

Signed-off-by: Christoph Hellwig
Signed-off-by: Arnd Bergmann

Christoph Hellwig
2015-10-15 06:21:07 +0800
2f8e2c877 move io-64-nonatomic*.h out of asm-generic ... Browse Code »

These are not implementations of default architecture code but helpers
for drivers. Move them to the place they belong to.

Signed-off-by: Christoph Hellwig
Acked-by: Darren Hart
Acked-by: Hitoshi Mitake
Signed-off-by: Arnd Bergmann

Christoph Hellwig
2015-10-15 06:21:07 +0800

14 Oct, 2015

1 commit

a7b761749 mm: add architecture primitives for software dirty bit clearing ... Browse Code »

There are primitives to create and query the software dirty bits
in a pte or pmd. But the clearing of the software dirty bits is done
in common code with x86 specific page table functions.

Add the missing architecture primitives to clear the software dirty
bits to allow the feature to be used on non-x86 systems, e.g. the
s390 architecture.

Acked-by: Cyrill Gorcunov
Signed-off-by: Martin Schwidefsky

Martin Schwidefsky
2015-10-14 20:32:05 +0800

06 Oct, 2015

5 commits

00eb4bab6 locking/rwsem: Use acquire/release semantics ... Browse Code »

As of 654672d4ba1 (locking/atomics: Add _{acquire|release|relaxed}()
variants of some atomic operations) and 6d79ef2d30e (locking, asm-generic:
Add _{relaxed|acquire|release}() variants for 'atomic_long_t'), weakly
ordered archs can benefit from more relaxed use of barriers when locking
and unlocking, instead of regular full barrier semantics. While currently
only arm64 supports such optimizations, updating corresponding locking
primitives serves for other archs to immediately benefit as well, once the
necessary machinery is implemented of course.

Signed-off-by: Davidlohr Bueso
Signed-off-by: Peter Zijlstra (Intel)
Reviewed-by: Thomas Gleixner
Cc: Andrew Morton
Cc: Linus Torvalds
Cc: Paul E. McKenney
Cc: Paul E.McKenney
Cc: Peter Zijlstra
Cc: Will Deacon
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/1443643395-17016-6-git-send-email-dave@stgolabs.net
Signed-off-by: Ingo Molnar

Davidlohr Bueso
2015-10-06 23:28:24 +0800
81a43adae locking/mutex: Use acquire/release semantics ... Browse Code »

As of 654672d4ba1 (locking/atomics: Add _{acquire|release|relaxed}()
variants of some atomic operations) and 6d79ef2d30e (locking, asm-generic:
Add _{relaxed|acquire|release}() variants for 'atomic_long_t'), weakly
ordered archs can benefit from more relaxed use of barriers when locking
and unlocking, instead of regular full barrier semantics. While currently
only arm64 supports such optimizations, updating corresponding locking
primitives serves for other archs to immediately benefit as well, once the
necessary machinery is implemented of course.

Signed-off-by: Davidlohr Bueso
Signed-off-by: Peter Zijlstra (Intel)
Reviewed-by: Thomas Gleixner
Cc: Andrew Morton
Cc: Linus Torvalds
Cc: Paul E. McKenney
Cc: Paul E.McKenney
Cc: Peter Zijlstra
Cc: Will Deacon
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/1443643395-17016-3-git-send-email-dave@stgolabs.net
Signed-off-by: Ingo Molnar

Davidlohr Bueso
2015-10-06 23:28:20 +0800
63ab7bd0d locking/asm-generic: Add _{relaxed|acquire|release}() variants for inc/dec atomics ... Browse Code »

Similar to what we have for regular add/sub calls. For now, no actual arch
implements them, so everyone falls back to the default atomics... iow,
nothing changes. These will be used in future primitives.

Signed-off-by: Davidlohr Bueso
Signed-off-by: Peter Zijlstra (Intel)
Reviewed-by: Thomas Gleixner
Cc: Andrew Morton
Cc: Linus Torvalds
Cc: Paul E. McKenney
Cc: Paul E.McKenney
Cc: Peter Zijlstra
Cc: Will Deacon
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/1443643395-17016-2-git-send-email-dave@stgolabs.net
Signed-off-by: Ingo Molnar

Davidlohr Bueso
2015-10-06 23:28:19 +0800
82fc167c3 Merge tag 'v4.3-rc4' into locking/core, to pick up fixes before applying new changes ... Browse Code »

Signed-off-by: Ingo Molnar

Ingo Molnar
2015-10-06 23:10:28 +0800
609ca0663 sched/core: Create preempt_count invariant ... Browse Code »

Assuming units of PREEMPT_DISABLE_OFFSET for preempt_count() numbers.

Now that TASK_DEAD no longer results in preempt_count() == 3 during
scheduling, we will always call context_switch() with preempt_count()
== 2.

However, we don't always end up with preempt_count() == 2 in
finish_task_switch() because new tasks get created with
preempt_count() == 1.

Create FORK_PREEMPT_COUNT and set it to 2 and use that in the right
places. Note that we cannot use INIT_PREEMPT_COUNT as that serves
another purpose (boot).

After this, preempt_count() is invariant across the context switch,
with exception of PREEMPT_ACTIVE.

Signed-off-by: Peter Zijlstra (Intel)
Cc: Linus Torvalds
Cc: Mike Galbraith
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar

Peter Zijlstra
2015-10-06 23:08:14 +0800

04 Oct, 2015

1 commit

30c44659f Merge branch 'strscpy' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile ... Browse Code »

Pull strscpy string copy function implementation from Chris Metcalf.

Chris sent this during the merge window, but I waffled back and forth on
the pull request, which is why it's going in only now.

The new "strscpy()" function is definitely easier to use and more secure
than either strncpy() or strlcpy(), both of which are horrible nasty
interfaces that have serious and irredeemable problems.

strncpy() has a useless return value, and doesn't NUL-terminate an
overlong result. To make matters worse, it pads a short result with
zeroes, which is a performance disaster if you have big buffers.

strlcpy(), by contrast, is a mis-designed "fix" for strlcpy(), lacking
the insane NUL padding, but having a differently broken return value
which returns the original length of the source string. Which means
that it will read characters past the count from the source buffer, and
you have to trust the source to be properly terminated. It also makes
error handling fragile, since the test for overflow is unnecessarily
subtle.

strscpy() avoids both these problems, guaranteeing the NUL termination
(but not excessive padding) if the destination size wasn't zero, and
making the overflow condition very obvious by returning -E2BIG. It also
doesn't read past the size of the source, and can thus be used for
untrusted source data too.

So why did I waffle about this for so long?

Every time we introduce a new-and-improved interface, people start doing
these interminable series of trivial conversion patches.

And every time that happens, somebody does some silly mistake, and the
conversion patch to the improved interface actually makes things worse.
Because the patch is mindnumbing and trivial, nobody has the attention
span to look at it carefully, and it's usually done over large swatches
of source code which means that not every conversion gets tested.

So I'm pulling the strscpy() support because it *is* a better interface.
But I will refuse to pull mindless conversion patches. Use this in
places where it makes sense, but don't do trivial patches to fix things
that aren't actually known to be broken.

* 'strscpy' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
tile: use global strscpy() rather than private copy
string: provide strscpy()
Make asm/word-at-a-time.h available on all architectures

Linus Torvalds
2015-10-04 23:31:13 +0800

01 Oct, 2015

3 commits

c625f76a9 clocksource / ACPI: Add probing infrastructure for ACPI-based clocksources ... Browse Code »

DT enjoys a rather nice probing infrastructure for clocksources,
while ACPI is so far stuck into a very distant past.

This patch introduces a declarative API, allowing clocksources
to be self-contained and be called when parsing the GTDT table.

Signed-off-by: Marc Zyngier
Acked-by: Thomas Gleixner
Tested-by: Hanjun Guo
Signed-off-by: Rafael J. Wysocki

Marc Zyngier
2015-10-01 08:18:38 +0800
46e589a39 irqchip / ACPI: Add probing infrastructure for ACPI-based irqchips ... Browse Code »

DT enjoys a rather nice probing infrastructure for irqchips, while
ACPI is so far stuck into a very distant past.

This patch introduces a declarative API, allowing irqchips to be
self-contained and be called when a particular entry is matched
in the MADT table.

Signed-off-by: Marc Zyngier
Acked-by: Catalin Marinas
Reviewed-by: Hanjun Guo
Acked-by: Thomas Gleixner
Tested-by: Hanjun Guo
Signed-off-by: Rafael J. Wysocki

Marc Zyngier
2015-10-01 08:18:38 +0800
e647b5322 ACPI: Add early device probing infrastructure ... Browse Code »

IRQ controllers and timers are the two types of device the kernel
requires before being able to use the device driver model.

ACPI so far lacks a proper probing infrastructure similar to the one
we have with DT, where we're able to declare IRQ chips and
clocksources inside the driver code, and let the core code pick it up
and call us back on a match. This leads to all kind of really ugly
hacks all over the arm64 code and even in the ACPI layer.

In order to allow some basic probing based on the ACPI tables,
introduce "struct acpi_probe_entry" which contains just enough
data and callbacks to match a table, an optional subtable, and
call a probe function. A driver can, at build time, register itself
and expect being called if the right entry exists in the ACPI
table.

A acpi_probe_device_table() is provided, taking an identifier for
a set of acpi_prove_entries, and iterating over the registered
entries.

Signed-off-by: Marc Zyngier
Reviewed-by: Lorenzo Pieralisi
Reviewed-by: Hanjun Guo
Acked-by: Thomas Gleixner
Tested-by: Hanjun Guo
Signed-off-by: Rafael J. Wysocki

Marc Zyngier
2015-10-01 08:18:38 +0800

23 Sep, 2015

4 commits

e3e72ab80 atomic: Implement atomic_read_ctrl() ... Browse Code »

Provide atomic_read_ctrl() to mirror READ_ONCE_CTRL(), such that we can
more conveniently use atomics in control dependencies.

Since we can assume atomic_read() implies a READ_ONCE(), we must only
emit an extra smp_read_barrier_depends() in order to upgrade to
READ_ONCE_CTRL() semantics.

Requested-by: Dmitry Vyukov
Signed-off-by: Peter Zijlstra (Intel)
Acked-by: Will Deacon
Cc: Andrew Morton
Cc: Linus Torvalds
Cc: Paul E. McKenney
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: linux-kernel@vger.kernel.org
Cc: oleg@redhat.com
Link: http://lkml.kernel.org/r/20150918115637.GM3604@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar

Peter Zijlstra
2015-09-23 15:54:29 +0800
62e8a3258 atomic, arch: Audit atomic_{read,set}() ... Browse Code »

This patch makes sure that atomic_{read,set}() are at least
{READ,WRITE}_ONCE().

We already had the 'requirement' that atomic_read() should use
ACCESS_ONCE(), and most archs had this, but a few were lacking.
All are now converted to use READ_ONCE().

And, by a symmetry and general paranoia argument, upgrade atomic_set()
to use WRITE_ONCE().

Signed-off-by: Peter Zijlstra (Intel)
Cc: Andrew Morton
Cc: Dmitry Vyukov
Cc: Linus Torvalds
Cc: Paul E. McKenney
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: james.hogan@imgtec.com
Cc: linux-kernel@vger.kernel.org
Cc: oleg@redhat.com
Cc: will.deacon@arm.com
Signed-off-by: Ingo Molnar

Peter Zijlstra
2015-09-23 15:54:28 +0800
90fe65148 atomic: Add atomic_long_t bitops ... Browse Code »

When adding the atomic bitops, I seem to have forgotten about
atomic_long_t, fix this.

Signed-off-by: Peter Zijlstra (Intel)
Cc: Andrew Morton
Cc: Linus Torvalds
Cc: Paul E. McKenney
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar

Peter Zijlstra
2015-09-23 15:54:28 +0800
4bbffe718 Merge branch 'locking/urgent' into locking/core, to pick up fixes before applying new changes ... Browse Code »

Signed-off-by: Ingo Molnar

Ingo Molnar
2015-09-23 15:52:03 +0800

20 Sep, 2015

1 commit

2673ee565 Merge branch 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm ... Browse Code »

Pull libnvdimm fixes from Dan Williams:

- a boot regression (since v4.2) fix for some ARM configurations from
Tyler

- regression (since v4.1) fixes for mkfs.xfs on a DAX enabled device
from Jeff. These are tagged for -stable.

- a pair of locking fixes from Axel that are hidden from lockdep since
they involve device_lock(). The "btt" one is tagged for -stable, the
other only applies to the new "pfn" mechanism in v4.3.

- a fix for the pmem ->rw_page() path to use wmb_pmem() from Ross.

* 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
mm: fix type cast in __pfn_to_phys()
pmem: add proper fencing to pmem_rw_page()
libnvdimm: pfn_devs: Fix locking in namespace_store
libnvdimm: btt_devs: Fix locking in namespace_store
blockdev: don't set S_DAX for misaligned partitions
dax: fix O_DIRECT I/O to the last block of a blockdev

Linus Torvalds
2015-09-20 10:13:03 +0800

19 Sep, 2015

1 commit

ae4f97696 mm: fix type cast in __pfn_to_phys() ... Browse Code »

The various definitions of __pfn_to_phys() have been consolidated to
use a generic macro in include/asm-generic/memory_model.h. This hit
mainline in the form of 012dcef3f058 "mm: move __phys_to_pfn and
__pfn_to_phys to asm/generic/memory_model.h". When the generic macro
was implemented the type cast to phys_addr_t was dropped which caused
boot regressions on ARM platforms with more than 4GB of memory and
LPAE enabled.

It was suggested to use PFN_PHYS() defined in include/linux/pfn.h
as provides the correct logic and avoids further duplication.

Reported-by: kernelci.org bot
Suggested-by: Dan Williams
Signed-off-by: Tyler Baker
Signed-off-by: Dan Williams

Tyler Baker
2015-09-19 15:58:10 +0800

18 Sep, 2015

1 commit

6e1e51969 locking/qrwlock: Rename ->lock to ->wait_lock ... Browse Code »

... trivial, but reads a little nicer when we name our
actual primitive 'lock'.

Signed-off-by: Davidlohr Bueso
Signed-off-by: Peter Zijlstra (Intel)
Cc: Andrew Morton
Cc: Linus Torvalds
Cc: Paul E. McKenney
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: Waiman Long
Link: http://lkml.kernel.org/r/1442216244-4409-1-git-send-email-dave@stgolabs.net
Signed-off-by: Ingo Molnar

Davidlohr Bueso
2015-09-18 15:27:29 +0800

17 Sep, 2015

1 commit

9786cff38 Merge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip ... Browse Code »

Pull locking fixes from Ingo Molnar:
"Spinlock performance regression fix, plus documentation fixes"

* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking/static_keys: Fix up the static keys documentation
locking/qspinlock/x86: Only emit the test-and-set fallback when building guest support
locking/qspinlock/x86: Fix performance regression under unaccelerated VMs
locking/static_keys: Fix a silly typo

Linus Torvalds
2015-09-17 23:45:23 +0800

13 Sep, 2015

1 commit

c7ef92cea Merge tag 'v4.3-rc1' into locking/core, to refresh the tree ... Browse Code »

Signed-off-by: Ingo Molnar

Ingo Molnar
2015-09-13 16:01:24 +0800

11 Sep, 2015

6 commits

43b3f0289 locking/qspinlock/x86: Fix performance regression under unaccelerated VMs ... Browse Code »

Dave ran into horrible performance on a VM without PARAVIRT_SPINLOCKS
set and Linus noted that the test-and-set implementation was retarded.

One should spin on the variable with a load, not a RMW.

While there, remove 'queued' from the name, as the lock isn't queued
at all, but a simple test-and-set.

Suggested-by: Linus Torvalds
Reported-by: Dave Chinner
Tested-by: Dave Chinner
Signed-off-by: Peter Zijlstra (Intel)
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: Waiman Long
Cc: stable@vger.kernel.org # v4.2+
Link: http://lkml.kernel.org/r/20150904152523.GR18673@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar

Peter Zijlstra
2015-09-11 13:49:42 +0800
452e06af1 dma-mapping: consolidate dma_set_mask ... Browse Code »

Almost everyone implements dma_set_mask the same way, although some time
that's hidden in ->set_dma_mask methods.

This patch consolidates those into a common implementation that either
calls ->set_dma_mask if present or otherwise uses the default
implementation. Some architectures used to only call ->set_dma_mask
after the initial checks, and those instance have been fixed to do the
full work. h8300 implemented dma_set_mask bogusly as a no-ops and has
been fixed.

Unfortunately some architectures overload unrelated semantics like changing
the dma_ops into it so we still need to allow for an architecture override
for now.

[jcmvbkbc@gmail.com: fix xtensa]
Signed-off-by: Christoph Hellwig
Cc: Arnd Bergmann
Cc: Russell King
Cc: Catalin Marinas
Cc: Will Deacon
Cc: Yoshinori Sato
Cc: Michal Simek
Cc: Jonas Bonn
Cc: Chris Metcalf
Cc: Guan Xuetao
Cc: Ralf Baechle
Cc: Benjamin Herrenschmidt
Cc: Ingo Molnar
Cc: Thomas Gleixner
Cc: "H. Peter Anvin"
Cc: Andy Shevchenko
Signed-off-by: Max Filippov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Christoph Hellwig
2015-09-11 04:29:01 +0800
ee196371d dma-mapping: consolidate dma_supported ... Browse Code »

Most architectures just call into ->dma_supported, but some also return 1
if the method is not present, or 0 if no dma ops are present (although
that should never happeb). Consolidate this more broad version into
common code.

Also fix h8300 which inorrectly always returned 0, which would have been
a problem if it's dma_set_mask implementation wasn't a similarly buggy
noop.

As a few architectures have much more elaborate implementations, we
still allow for arch overrides.

[jcmvbkbc@gmail.com: fix xtensa]
Signed-off-by: Christoph Hellwig
Cc: Arnd Bergmann
Cc: Russell King
Cc: Catalin Marinas
Cc: Will Deacon
Cc: Yoshinori Sato
Cc: Michal Simek
Cc: Jonas Bonn
Cc: Chris Metcalf
Cc: Guan Xuetao
Cc: Ralf Baechle
Cc: Benjamin Herrenschmidt
Cc: Ingo Molnar
Cc: Thomas Gleixner
Cc: "H. Peter Anvin"
Cc: Andy Shevchenko
Signed-off-by: Max Filippov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Christoph Hellwig
2015-09-11 04:29:01 +0800
efa21e432 dma-mapping: cosolidate dma_mapping_error ... Browse Code »

Currently there are three valid implementations of dma_mapping_error:

(1) call ->mapping_error
(2) check for a hardcoded error code
(3) always return 0

This patch provides a common implementation that calls ->mapping_error
if present, then checks for DMA_ERROR_CODE if defined or otherwise
returns 0.

[jcmvbkbc@gmail.com: fix xtensa]
Signed-off-by: Christoph Hellwig
Cc: Arnd Bergmann
Cc: Russell King
Cc: Catalin Marinas
Cc: Will Deacon
Cc: Yoshinori Sato
Cc: Michal Simek
Cc: Jonas Bonn
Cc: Chris Metcalf
Cc: Guan Xuetao
Cc: Ralf Baechle
Cc: Benjamin Herrenschmidt
Cc: Ingo Molnar
Cc: Thomas Gleixner
Cc: "H. Peter Anvin"
Cc: Andy Shevchenko
Signed-off-by: Max Filippov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Christoph Hellwig
2015-09-11 04:29:01 +0800
1e8937526 dma-mapping: consolidate dma_{alloc,free}_noncoherent ... Browse Code »

Most architectures do not support non-coherent allocations and either
define dma_{alloc,free}_noncoherent to their coherent versions or stub
them out.

Openrisc uses dma_{alloc,free}_attrs to implement them, and only Mips
implements them directly.

This patch moves the Openrisc version to common code, and handles the
DMA_ATTR_NON_CONSISTENT case in the mips dma_map_ops instance.

Note that actual non-coherent allocations require a dma_cache_sync
implementation, so if non-coherent allocations didn't work on
an architecture before this patch they still won't work after it.

[jcmvbkbc@gmail.com: fix xtensa]
Signed-off-by: Christoph Hellwig
Cc: Arnd Bergmann
Cc: Russell King
Cc: Catalin Marinas
Cc: Will Deacon
Cc: Yoshinori Sato
Cc: Michal Simek
Cc: Jonas Bonn
Cc: Chris Metcalf
Cc: Guan Xuetao
Cc: Ralf Baechle
Cc: Benjamin Herrenschmidt
Cc: Ingo Molnar
Cc: Thomas Gleixner
Cc: "H. Peter Anvin"
Cc: Andy Shevchenko
Signed-off-by: Max Filippov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Christoph Hellwig
2015-09-11 04:29:01 +0800
6894258ed dma-mapping: consolidate dma_{alloc,free}_{attrs,coherent} ... Browse Code »

Since 2009 we have a nice asm-generic header implementing lots of DMA API
functions for architectures using struct dma_map_ops, but unfortunately
it's still missing a lot of APIs that all architectures still have to
duplicate.

This series consolidates the remaining functions, although we still need
arch opt outs for two of them as a few architectures have very
non-standard implementations.

This patch (of 5):

The coherent DMA allocator works the same over all architectures supporting
dma_map operations.

This patch consolidates them and converges the minor differences:

- the debug_dma helpers are now called from all architectures, including
those that were previously missing them
- dma_alloc_from_coherent and dma_release_from_coherent are now always
called from the generic alloc/free routines instead of the ops
dma-mapping-common.h always includes dma-coherent.h to get the defintions
for them, or the stubs if the architecture doesn't support this feature
- checks for ->alloc / ->free presence are removed. There is only one
magic instead of dma_map_ops without them (mic_dma_ops) and that one
is x86 only anyway.

Besides that only x86 needs special treatment to replace a default devices
if none is passed and tweak the gfp_flags. An optional arch hook is provided
for that.

[linux@roeck-us.net: fix build]
[jcmvbkbc@gmail.com: fix xtensa]
Signed-off-by: Christoph Hellwig
Cc: Arnd Bergmann
Cc: Russell King
Cc: Catalin Marinas
Cc: Will Deacon
Cc: Yoshinori Sato
Cc: Michal Simek
Cc: Jonas Bonn
Cc: Chris Metcalf
Cc: Guan Xuetao
Cc: Ralf Baechle
Cc: Benjamin Herrenschmidt
Cc: Ingo Molnar
Cc: Thomas Gleixner
Cc: "H. Peter Anvin"
Cc: Andy Shevchenko
Signed-off-by: Guenter Roeck
Signed-off-by: Max Filippov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Christoph Hellwig
2015-09-11 04:29:01 +0800