Eric Lee / smarc-fsl-linux-kernel

13 Oct, 2015

1 commit

39d114ddc arm64: add KASAN support

This patch adds arch-specific code for the kernel address sanitizer
(see Documentation/kasan.txt).

1/8 of the kernel address space is reserved for shadow memory. There
was no big enough hole for this, so the virtual addresses for the
shadow were stolen from the vmalloc area.
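
The 1/8 ratio follows from one shadow byte describing 8 bytes of memory. A minimal sketch of that mapping is shown below. The function names and the SHADOW_OFFSET value are assumptions made for illustration only; they are not taken from this patch.

```c
#include <assert.h>
#include <stdint.h>

/* One shadow byte describes 8 bytes of memory, hence 1/8 of the addresses. */
#define SHADOW_SCALE_SHIFT 3

/* Hypothetical offset, chosen only for this illustration. */
#define SHADOW_OFFSET UINT64_C(0xdfff200000000000)

/* Map an address to the address of the shadow byte that describes it. */
static uint64_t mem_to_shadow(uint64_t addr)
{
	return (addr >> SHADOW_SCALE_SHIFT) + SHADOW_OFFSET;
}

/* Size of the shadow needed to cover a region of the given size. */
static uint64_t shadow_size(uint64_t region_size)
{
	assert(region_size % (1u << SHADOW_SCALE_SHIFT) == 0);
	return region_size >> SHADOW_SCALE_SHIFT;
}
```

Under these assumptions, all 8 bytes of an aligned 8-byte granule share one shadow byte, and a 4096-byte page needs 512 bytes of shadow.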

At the early boot stage the whole shadow region is populated with
just one physical page (kasan_zero_page). Later, this page is reused
as a read-only zero shadow for memory that KASan currently doesn't
track (vmalloc).
After the physical memory is mapped, pages for shadow memory are
allocated and mapped.

Functions like memset/memmove/memcpy do a lot of memory accesses.
If a bad pointer is passed to one of these functions, it is important
to catch this. Compiler instrumentation cannot do this since
these functions are written in assembly.
KASan replaces the memory functions with manually instrumented variants.
The original functions are declared as weak symbols so that the strong
definitions in mm/kasan/kasan.c can replace them. The original functions
have aliases with a '__' prefix in their names, so the non-instrumented
variant can be called if needed.
Some files are built without KASan instrumentation (e.g. mm/slub.c).
For such files, the original mem* functions are replaced (via #define)
with the prefixed variants to disable memory access checks.
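
The wrapper scheme can be sketched in portable C roughly as follows. All names here (raw_memset, checked_memset, my_memset, range_is_valid, NO_INSTRUMENTATION) are hypothetical, and the validity check is a stand-in for a real shadow lookup. Unlike the kernel, this sketch skips the access after reporting it, to keep the example safe to run.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for shadow memory: only this one buffer counts as valid. */
static char valid_buf[64];
static int bad_access_reports;

static bool range_is_valid(const void *p, size_t len)
{
	uintptr_t start = (uintptr_t)p;
	uintptr_t lo = (uintptr_t)valid_buf;

	return start >= lo && len <= sizeof(valid_buf) &&
	       start - lo <= sizeof(valid_buf) - len;
}

/* Non-instrumented variant, analogous to the '__'-prefixed alias. */
static void *raw_memset(void *dst, int c, size_t len)
{
	assert(dst != NULL);
	return memset(dst, c, len);
}

/* Manually instrumented variant: check the range, then do the access. */
static void *checked_memset(void *dst, int c, size_t len)
{
	if (!range_is_valid(dst, len)) {
		bad_access_reports++;	/* report and skip (sketch only) */
		return dst;
	}
	return raw_memset(dst, c, len);
}

/* Files built without instrumentation select the raw variant via #define. */
#ifdef NO_INSTRUMENTATION
#define my_memset(dst, c, len) raw_memset(dst, c, len)
#else
#define my_memset(dst, c, len) checked_memset(dst, c, len)
#endif
```

The #define at the end plays the role described above for files such as mm/slub.c: the call sites stay unchanged and only the selected variant differs.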

Signed-off-by: Andrey Ryabinin
Tested-by: Linus Walleij
Reviewed-by: Catalin Marinas
Signed-off-by: Catalin Marinas

Andrey Ryabinin
2015-10-13 00:46:36 +0800

12 Oct, 2015

1 commit

207918461 arm64: use ENDPIPROC() to annotate position independent assembler routines

For more control over which functions are called with the MMU off or
with the UEFI 1:1 mapping active, annotate some assembler routines as
position independent. This is done by introducing ENDPIPROC(), which
replaces the ENDPROC() declaration of those routines.

Signed-off-by: Ard Biesheuvel
Signed-off-by: Catalin Marinas

Ard Biesheuvel
2015-10-12 23:19:45 +0800

07 Oct, 2015

2 commits

404268828 arm64: copy_to-from-in_user optimization using copy template

This patch optimizes copy_to-from-in_user for the arm 64-bit architecture.
The copy template is used as the template file for all the copy*.S files.
A minor change was made to it to accommodate the copy to/from/in user files.

Signed-off-by: Feng Kan
Signed-off-by: Balamurugan Shanmugam
Signed-off-by: Catalin Marinas

Feng Kan
2015-10-07 18:34:44 +0800

e5c88e3f2 arm64: Change memcpy in kernel to use the copy template file

This converts memcpy.S to use the copy template file. The copy
template file was originally based on memcpy.S.

Signed-off-by: Feng Kan
Signed-off-by: Balamurugan Shanmugam
[catalin.marinas@arm.com: removed tmp3(w) .req statements as they are not used]
Signed-off-by: Catalin Marinas

Feng Kan
2015-10-07 18:34:43 +0800

27 Jul, 2015

5 commits

0ea366f5e arm64: atomics: prefetch the destination word for write prior to stxr

The cost of changing a cacheline from shared to exclusive state can be
significant, especially when this is triggered by an exclusive store,
since it may result in having to retry the transaction.

This patch makes use of prfm to prefetch cachelines for write prior to
ldxr/stxr loops when using the ll/sc atomic routines.

Reviewed-by: Catalin Marinas
Signed-off-by: Will Deacon

Will Deacon
2015-07-27 22:28:53 +0800

084f90372 arm64: bitops: patch in lse instructions when supported by the CPU

On CPUs which support the LSE atomic instructions introduced in ARMv8.1,
it makes sense to use them in preference to ll/sc sequences.

This patch introduces runtime patching of our bitops functions so that
LSE atomic instructions are used instead.

Reviewed-by: Steve Capper
Reviewed-by: Catalin Marinas
Signed-off-by: Will Deacon

Will Deacon
2015-07-27 22:28:51 +0800

c0385b24a arm64: introduce CONFIG_ARM64_LSE_ATOMICS as fallback to ll/sc atomics

In order to patch in the new atomic instructions at runtime, we need to
generate wrappers around the out-of-line exclusive load/store atomics.

This patch adds a new Kconfig option, CONFIG_ARM64_LSE_ATOMICS, which
causes our atomic functions to branch to the out-of-line ll/sc
implementations. To avoid the register spill overhead of the PCS, the
out-of-line functions are compiled with specific compiler flags to
force out-of-line save/restore of any registers that are usually
caller-saved.

Reviewed-by: Catalin Marinas
Signed-off-by: Will Deacon

Will Deacon
2015-07-27 22:28:50 +0800

338d4f49d arm64: kernel: Add support for Privileged Access Never

'Privileged Access Never' is a new ARMv8.1 feature which prevents
privileged code from accessing any virtual address where read or write
access is also permitted at EL0.

This patch enables the PAN feature on all CPUs, and modifies the
{get,put}_user helpers to temporarily permit access.

This will catch kernel bugs where user memory is accessed directly.
'Unprivileged loads and stores' using ldtrb et al are unaffected by PAN.

Reviewed-by: Catalin Marinas
Signed-off-by: James Morse
[will: use ALTERNATIVE in asm and tidy up pan_enable check]
Signed-off-by: Will Deacon

James Morse
2015-07-27 18:08:41 +0800

23e949944 arm64: lib: use pair accessors for copy_*_user routines

The AArch64 instruction set contains load/store pair memory accessors,
so use these in our copy_*_user routines to transfer 16 bytes per
iteration.
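
The loop shape can be sketched in C as below. This is only an illustration under stated assumptions: copy_by_pairs is a hypothetical name, the two memcpy calls stand in for a load pair and a store pair, and the fault handling that the real user-copy routines need is left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy 'n' bytes, moving two 64-bit words (16 bytes) per loop iteration. */
static void copy_by_pairs(void *dst, const void *src, size_t n)
{
	unsigned char *d = dst;
	const unsigned char *s = src;

	assert(d != NULL && s != NULL);
	while (n >= 16) {
		uint64_t pair[2];

		memcpy(pair, s, sizeof(pair));	/* stands in for a load pair */
		memcpy(d, pair, sizeof(pair));	/* stands in for a store pair */
		d += 16;
		s += 16;
		n -= 16;
	}
	while (n--)				/* byte-wise tail */
		*d++ = *s++;
}
```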

Reviewed-by: Catalin Marinas
Signed-off-by: Will Deacon

Will Deacon
2015-07-27 18:08:39 +0800

13 Nov, 2014

1 commit

97fc15436 arm64: __clear_user: handle exceptions on strb

ARM64 currently doesn't fix up faults on the single-byte (strb) case of
__clear_user, which means that we can cause a nasty kernel panic as an
ordinary user with any read of (a multiple of PAGE_SIZE) + 1 bytes from
/dev/zero, e.g.: dd if=/dev/zero of=foo ibs=1 count=1 (or ibs=65537, etc.)

This is a pretty obscure bug in the general case since we'll only
__do_kernel_fault (since there's no extable entry for pc) if the
mmap_sem is contended. However, with CONFIG_DEBUG_VM enabled, we'll
always fault.

    if (!down_read_trylock(&mm->mmap_sem)) {
        if (!user_mode(regs) && !search_exception_tables(regs->pc))
            goto no_context;
retry:
        down_read(&mm->mmap_sem);
    } else {
        /*
         * The above down_read_trylock() might have succeeded in
         * which case, we'll have missed the might_sleep() from
         * down_read().
         */
        might_sleep();
        if (!user_mode(regs) && !search_exception_tables(regs->pc))
            goto no_context;
    }

Fix that by adding an extable entry for the strb instruction, since it
touches user memory, similar to the other stores in __clear_user.
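
To illustrate why a missing entry matters, here is a simplified model of an exception table lookup. The struct layout, the names and the addresses are all made up for this sketch; the kernel's real table encoding and search routine differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified entry using absolute addresses. */
struct extable_entry {
	uintptr_t insn;		/* address of an instruction that may fault */
	uintptr_t fixup;	/* where to resume if it does */
};

static int cmp_entry(const void *key, const void *elem)
{
	uintptr_t pc = *(const uintptr_t *)key;
	const struct extable_entry *e = elem;

	return (pc > e->insn) - (pc < e->insn);
}

/* Return the fixup address for 'pc', or 0 if the fault has no entry. */
static uintptr_t search_extable(const struct extable_entry *table,
				size_t count, uintptr_t pc)
{
	const struct extable_entry *e;

	assert(table != NULL);
	e = bsearch(&pc, table, count, sizeof(*table), cmp_entry);
	return e ? e->fixup : 0;
}

/* Made-up addresses, sorted by insn; the last entry models the added strb. */
static const struct extable_entry demo_table[] = {
	{ 0x1000, 0x2000 },
	{ 0x1004, 0x2000 },
	{ 0x1008, 0x2000 },
};
```

In this model, a fault at 0x1008 finds a fixup only when the third entry is present. Without it the lookup returns 0, which corresponds to taking the no_context path in the snippet above.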

Signed-off-by: Kyle McMartin
Reported-by: Miloš Prchlík
Cc: stable@vger.kernel.org
Signed-off-by: Catalin Marinas

Kyle McMartin
2014-11-13 23:21:26 +0800

23 May, 2014

6 commits

0a42cb0a6 arm64: lib: Implement optimized string length routines

This patch, based on Linaro's Cortex Strings library, adds
assembly-optimized strlen() and strnlen() functions.

Signed-off-by: Zhichang Yuan
Signed-off-by: Deepak Saxena
Signed-off-by: Catalin Marinas

zhichang.yuan
2014-05-23 22:17:12 +0800

192c4d902 arm64: lib: Implement optimized string compare routines

This patch, based on Linaro's Cortex Strings library, adds
assembly-optimized strcmp() and strncmp() functions.

Signed-off-by: Zhichang Yuan
Signed-off-by: Deepak Saxena
Signed-off-by: Catalin Marinas

zhichang.yuan
2014-05-23 22:16:59 +0800

d875c9b37 arm64: lib: Implement optimized memcmp routine

This patch, based on Linaro's Cortex Strings library, adds
an assembly optimized memcmp() function.

Signed-off-by: Zhichang Yuan
Signed-off-by: Deepak Saxena
Signed-off-by: Catalin Marinas

zhichang.yuan
2014-05-23 22:07:57 +0800

b29a51fe0 arm64: lib: Implement optimized memset routine

This patch, based on Linaro's Cortex Strings library, improves
the performance of the assembly optimized memset() function.

Signed-off-by: Zhichang Yuan
Signed-off-by: Deepak Saxena
Signed-off-by: Catalin Marinas

zhichang.yuan
2014-05-23 22:07:48 +0800

280adc195 arm64: lib: Implement optimized memmove routine

This patch, based on Linaro's Cortex Strings library, improves
the performance of the assembly optimized memmove() function.

Signed-off-by: Zhichang Yuan
Signed-off-by: Deepak Saxena
Signed-off-by: Catalin Marinas

zhichang.yuan
2014-05-23 22:07:35 +0800

808dbac6b arm64: lib: Implement optimized memcpy routine

This patch, based on Linaro's Cortex Strings library, improves
the performance of the assembly optimized memcpy() function.

Signed-off-by: Zhichang Yuan
Signed-off-by: Deepak Saxena
Signed-off-by: Catalin Marinas

zhichang.yuan
2014-05-23 22:06:53 +0800

08 Feb, 2014

1 commit

8e86f0b40 arm64: atomics: fix use of acquire + release for full barrier semantics

Linux requires a number of atomic operations to provide full barrier
semantics; that is, no memory accesses after the operation can be
observed before any accesses up to and including the operation in
program order.

On arm64, these operations have been incorrectly implemented as follows:

    // A, B, C are independent memory locations

    <Access [A]>

    // atomic_op (B)
1:  ldaxr   x0, [B]         // Exclusive load with acquire
    <op(B)>
    stlxr   w1, x0, [B]     // Exclusive store with release
    cbnz    w1, 1b

    <Access [C]>

The assumption here being that two half barriers are equivalent to a
full barrier, so the only permitted ordering would be A -> B -> C
(where B is the atomic operation involving both a load and a store).

Unfortunately, this is not the case by the letter of the architecture
and, in fact, the accesses to A and C are permitted to pass their
nearest half barrier resulting in orderings such as Bl -> A -> C -> Bs
or Bl -> C -> A -> Bs (where Bl is the load-acquire on B and Bs is the
store-release on B). This is a clear violation of the full barrier
requirement.

The simple way to fix this is to implement the same algorithm as ARMv7
using explicit barriers:

    // atomic_op (B)
    dmb     ish             // Full barrier
1:  ldxr    x0, [B]         // Exclusive load
    <op(B)>
    stxr    w1, x0, [B]     // Exclusive store
    cbnz    w1, 1b
    dmb     ish             // Full barrier

but this has the undesirable effect of introducing *two* full barrier
instructions. A better approach is actually the following, non-intuitive
sequence:

    // atomic_op (B)
1:  ldxr    x0, [B]         // Exclusive load
    <op(B)>
    stlxr   w1, x0, [B]     // Exclusive store with release
    cbnz    w1, 1b
    dmb     ish             // Full barrier

The simple observations here are:

  - The dmb ensures that no subsequent accesses (e.g. the access to C)
    can enter or pass the atomic sequence.

  - The dmb also ensures that no prior accesses (e.g. the access to A)
    can pass the atomic sequence.

  - Therefore, no prior access can pass a subsequent access, or
    vice-versa (i.e. A is strictly ordered before C).

  - The stlxr ensures that no prior access can pass the store component
    of the atomic operation.

The only tricky part remaining is the ordering between the ldxr and the
access to A, since the absence of the first dmb means that we're now
permitting re-ordering between the ldxr and any prior accesses.

From an (arbitrary) observer's point of view, there are two scenarios:

  1. We have observed the ldxr. This means that if we perform a store to
     [B], the ldxr will still return older data. If we can observe the
     ldxr, then we can potentially observe the permitted re-ordering
     with the access to A, which is clearly an issue when compared to
     the dmb variant of the code. Thankfully, the exclusive monitor will
     save us here since it will be cleared as a result of the store and
     the ldxr will retry. Notice that any use of a later memory
     observation to imply observation of the ldxr will also imply
     observation of the access to A, since the stlxr/dmb ensure strict
     ordering.

  2. We have not observed the ldxr. This means we can perform a store
     and influence the later ldxr. However, that doesn't actually tell
     us anything about the access to [A], so we've not lost anything
     here either when compared to the dmb variant.

This patch implements this solution for our barriered atomic operations,
ensuring that we satisfy the full barrier requirements where they are
needed.

Cc:
Cc: Peter Zijlstra
Signed-off-by: Will Deacon
Signed-off-by: Catalin Marinas

Will Deacon
2014-02-08 00:45:43 +0800

20 Dec, 2013

1 commit

12a0ef7b0 arm64: use generic strnlen_user and strncpy_from_user functions

This patch implements the word-at-a-time interface for arm64 using the
same algorithm as ARM. We use the fls64 macro, which expands to a clz
instruction via a compiler builtin. Big-endian configurations make use
of the implementation from asm-generic.

With this implemented, we can replace our byte-at-a-time strnlen_user
and strncpy_from_user functions with the optimised generic versions.
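
A small user-space sketch of the little-endian word-at-a-time technique follows. The helper names are chosen for this example and the code is an approximation of the idea rather than a copy of the kernel header. It assumes a GCC-compatible compiler for __builtin_clzll, and load_le64 builds the word byte by byte so the result does not depend on the host's endianness.

```c
#include <assert.h>
#include <stdint.h>

#define ONE_BITS  UINT64_C(0x0101010101010101)
#define HIGH_BITS UINT64_C(0x8080808080808080)

/* 1-based index of the most significant set bit, 0 for x == 0. */
static int fls64_sketch(uint64_t x)
{
	return x ? 64 - __builtin_clzll(x) : 0;
}

/* Non-zero if any byte of 'a' is zero; the lowest set bit marks the first. */
static uint64_t has_zero(uint64_t a)
{
	return (a - ONE_BITS) & ~a & HIGH_BITS;
}

/* Byte index of the first zero byte, given a non-zero has_zero() result. */
static unsigned int find_zero(uint64_t bits)
{
	assert(bits != 0);
	bits = (bits - 1) & ~bits;	/* all bits below the lowest set bit */
	return (unsigned int)fls64_sketch(bits >> 7) >> 3;
}

/* Pack 8 bytes with byte 0 in the low bits, i.e. a little-endian load. */
static uint64_t load_le64(const unsigned char *p)
{
	uint64_t w = 0;

	for (int i = 0; i < 8; i++)
		w |= (uint64_t)p[i] << (8 * i);
	return w;
}
```

Checking eight bytes per step this way is what lets the generic string routines avoid a byte-at-a-time loop.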

Signed-off-by: Will Deacon
Signed-off-by: Catalin Marinas

Will Deacon
2013-12-20 01:43:06 +0800

08 May, 2013

1 commit

420c158dc arm64: Treat the bitops index argument as an 'int'

The bitops prototypes use an 'int' as the bit index type but the asm
implementation assumes it to be a 'long'. Since the compiler does not
guarantee zeroing the upper 32 bits of a register when it is used as an
'int', change the bitops implementation accordingly.
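
The effect of stale upper bits can be modelled in plain C by treating a 64-bit value as the register that carries the index. The function names are hypothetical and the sketch assumes a non-negative index.

```c
#include <assert.h>
#include <stdint.h>

/* Word offset computed from the whole 64-bit register ('long' view). */
static uint64_t word_offset_as_long(uint64_t reg)
{
	return reg >> 6;
}

/* Word offset computed from the low 32 bits only ('int' view). */
static uint64_t word_offset_as_int(uint64_t reg)
{
	assert((uint32_t)reg <= INT32_MAX);	/* non-negative index assumed */
	return (uint32_t)reg >> 6;
}
```

With bit index 69 in the low half and leftover data in the upper half of the register, the 'int' view still selects word 1, while the 'long' view computes an offset far outside the bitmap.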

Signed-off-by: Catalin Marinas

Catalin Marinas
2013-05-08 17:33:17 +0800

30 Apr, 2013

2 commits

16c85a1fd arm64: Use acquire/release semantics instead of explicit DMB

This patch changes the test_and_*_bit functions to use the
load-acquire/store-release instructions instead of explicit DMB.

Signed-off-by: Catalin Marinas

Catalin Marinas
2013-04-30 22:58:37 +0800

c47d6a04e arm64: klib: bitops: fix unpredictable stxr usage

We're currently relying on unpredictable behaviour in our testops
(test_and_*_bit), as stxr is unpredictable when the status register and
the source register are the same.

This patch reallocates the status register so as to bring us back into
the realm of predictable behaviour. Boot tested on an AEMv8 model.

Signed-off-by: Mark Rutland
Signed-off-by: Catalin Marinas

Mark Rutland
2013-04-30 22:53:01 +0800

22 Mar, 2013

3 commits

624795865 arm64: klib: Optimised atomic bitops

This patch implements the AArch64-specific atomic bitops functions using
exclusive memory accesses to avoid locking.

Signed-off-by: Catalin Marinas

Catalin Marinas
2013-03-22 01:39:31 +0800

2b8cac814 arm64: klib: Optimised string functions

This patch introduces AArch64-specific string functions (strchr,
strrchr).

Signed-off-by: Catalin Marinas

Catalin Marinas
2013-03-22 01:39:30 +0800

4a8992271 arm64: klib: Optimised memory functions

This patch introduces AArch64-specific memory functions (memcpy,
memmove, memchr, memset). These functions are not optimised for any CPU
implementation but can be used as a starting point once hardware is
available.

Signed-off-by: Catalin Marinas

Catalin Marinas
2013-03-22 01:39:29 +0800

17 Sep, 2012

2 commits

f27bb139c arm64: Miscellaneous library functions

This patch adds udelay, memory and bit operations together with the
ksyms exports.

Signed-off-by: Marc Zyngier
Signed-off-by: Will Deacon
Signed-off-by: Catalin Marinas
Acked-by: Tony Lindgren
Acked-by: Nicolas Pitre
Acked-by: Olof Johansson
Acked-by: Santosh Shilimkar

Marc Zyngier
2012-09-17 20:42:18 +0800

0aea86a21 arm64: User access library functions

This patch adds support for various user access functions. These
functions use the standard LDR/STR instructions and not the LDRT/STRT
variants in order to allow kernel addresses (after set_fs(KERNEL_DS)).

Signed-off-by: Will Deacon
Signed-off-by: Marc Zyngier
Signed-off-by: Catalin Marinas
Acked-by: Tony Lindgren
Acked-by: Nicolas Pitre
Acked-by: Olof Johansson
Acked-by: Santosh Shilimkar
Acked-by: Arnd Bergmann

Catalin Marinas
2012-09-17 20:42:11 +0800