08 Aug, 2019

1 commit

  • When building with W=1, warnings about missing prototypes are emitted:

    CC arch/x86/lib/cpu.o
    arch/x86/lib/cpu.c:5:14: warning: no previous prototype for 'x86_family' [-Wmissing-prototypes]
        5 | unsigned int x86_family(unsigned int sig)
          |              ^~~~~~~~~~
    arch/x86/lib/cpu.c:18:14: warning: no previous prototype for 'x86_model' [-Wmissing-prototypes]
       18 | unsigned int x86_model(unsigned int sig)
          |              ^~~~~~~~~
    arch/x86/lib/cpu.c:33:14: warning: no previous prototype for 'x86_stepping' [-Wmissing-prototypes]
       33 | unsigned int x86_stepping(unsigned int sig)
          |              ^~~~~~~~~~~~

    Add the proper include file so the prototypes are there.
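
    A minimal sketch of the pattern (assuming the prototypes live in
    <asm/cpu.h>; this is an illustration, not necessarily the exact
    upstream diff):

    /* Including the header that declares the function both silences
     * -Wmissing-prototypes and lets the compiler verify that definition
     * and declaration agree. */
    #include <asm/cpu.h>

    unsigned int x86_family(unsigned int sig)
    {
            unsigned int x86;

            x86 = (sig >> 8) & 0xf;
            if (x86 == 0xf)
                    x86 += (sig >> 20) & 0xff;

            return x86;
    }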

    Signed-off-by: Valdis Kletnieks
    Signed-off-by: Thomas Gleixner
    Link: https://lkml.kernel.org/r/42513.1565234837@turing-police

    Valdis Klētnieks
     

19 Jul, 2019

3 commits

  • The same getuser/putuser error paths are used regardless of whether AC
    is set. In non-exception failure cases, this results in an unnecessary
    CLAC.

    Fixes the following warnings:

    arch/x86/lib/getuser.o: warning: objtool: .altinstr_replacement+0x18: redundant UACCESS disable
    arch/x86/lib/putuser.o: warning: objtool: .altinstr_replacement+0x18: redundant UACCESS disable

    Signed-off-by: Josh Poimboeuf
    Signed-off-by: Thomas Gleixner
    Acked-by: Peter Zijlstra (Intel)
    Link: https://lkml.kernel.org/r/bc14ded2755ae75bd9010c446079e113dbddb74b.1563413318.git.jpoimboe@redhat.com

    Josh Poimboeuf
     
  • After adding mcsafe_handle_tail() to the objtool uaccess safe list,
    objtool reports:

    arch/x86/lib/usercopy_64.o: warning: objtool: mcsafe_handle_tail()+0x0: call to __fentry__() with UACCESS enabled

    With SMAP, this function is called with AC=1, so it needs to be careful
    about which functions it calls. Disable the ftrace entry hook, which
    can potentially pull in a lot of extra code.
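
    A hedged sketch of the annotation (the real helper is in
    arch/x86/lib/usercopy_64.c; the body here is a trivial stand-in):

    #include <linux/compiler.h>

    /* notrace suppresses the __fentry__/mcount call the compiler would
     * otherwise emit at function entry, so no tracing code runs while the
     * caller still has AC=1, i.e. the SMAP user-access window is open. */
    __visible notrace unsigned long
    mcsafe_handle_tail(char *to, char *from, unsigned len)
    {
            /* careful byte-by-byte tail copy elided */
            return len;
    }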

    Signed-off-by: Josh Poimboeuf
    Signed-off-by: Thomas Gleixner
    Acked-by: Peter Zijlstra (Intel)
    Link: https://lkml.kernel.org/r/8e13d6f0da1c8a3f7603903da6cbf6d582bbfe10.1563413318.git.jpoimboe@redhat.com

    Josh Poimboeuf
     
  • After an objtool improvement, it's complaining about the CLAC in
    copy_user_handle_tail():

    arch/x86/lib/copy_user_64.o: warning: objtool: .altinstr_replacement+0x12: redundant UACCESS disable
    arch/x86/lib/copy_user_64.o: warning: objtool: copy_user_handle_tail()+0x6: (alt)
    arch/x86/lib/copy_user_64.o: warning: objtool: copy_user_handle_tail()+0x2: (alt)
    arch/x86/lib/copy_user_64.o: warning: objtool: copy_user_handle_tail()+0x0:

    Signed-off-by: Josh Poimboeuf
    Signed-off-by: Thomas Gleixner
    Acked-by: Peter Zijlstra (Intel)
    Link: https://lkml.kernel.org/r/6b6e436774678b4b9873811ff023bd29935bee5b.1563413318.git.jpoimboe@redhat.com

    Josh Poimboeuf
     

09 Jul, 2019

1 commit

  • Pull SMP/hotplug updates from Thomas Gleixner:
    "A small set of updates for SMP and CPU hotplug:

    - Abort disabling secondary CPUs in the freezer when a wakeup is
    pending instead of evaluating it only after all CPUs have been
    offlined.

    - Remove the shared annotation for the strict per CPU cfd_data in the
    smp function call core code.

    - Remove the return values of smp_call_function() and on_each_cpu()
    as they are unconditionally 0. Fixup the few callers which actually
    bothered to check the return value"

    * 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    smp: Remove smp_call_function() and on_each_cpu() return values
    smp: Do not mark call_function_data as shared
    cpu/hotplug: Abort disabling secondary CPUs if wakeup is pending
    cpu/hotplug: Fix notify_cpu_starting() reference in bringup_wait_for_ap()

    Linus Torvalds
     

23 Jun, 2019

1 commit

  • The return value is unconditionally zero. Remove it and amend the callers.

    [ tglx: Fixup arm/bL_switcher and powerpc/rtas ]
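
    A hedged sketch of the interface change plus a typical caller fixup
    (the caller below is illustrative, not taken from a specific file):

    #include <linux/smp.h>

    /*
     * After the change, include/linux/smp.h reads (return value dropped,
     * since it was unconditionally 0):
     *
     *   void smp_call_function(smp_call_func_t func, void *info, int wait);
     *   void on_each_cpu(smp_call_func_t func, void *info, int wait);
     */

    static void do_flush_demo(void *info)
    {
            /* per-CPU work */
    }

    static void flush_all_demo(void)
    {
            /* was: ret = smp_call_function(...); if (ret) ...; the check
             * is dropped because the call can no longer "fail" */
            smp_call_function(do_flush_demo, NULL, 1);
    }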

    Signed-off-by: Nadav Amit
    Signed-off-by: Thomas Gleixner
    Cc: Peter Zijlstra
    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Dave Hansen
    Cc: Richard Henderson
    Cc: Ivan Kokshaysky
    Cc: Matt Turner
    Cc: Tony Luck
    Cc: Fenghua Yu
    Cc: Andrew Morton
    Link: https://lkml.kernel.org/r/20190613064813.8102-2-namit@vmware.com

    Nadav Amit
     

19 Jun, 2019

2 commits

  • Based on 1 normalized pattern(s):

    this file is free software you can redistribute it and or modify it
    under the terms of version 2 of the gnu general public license as
    published by the free software foundation this program is
    distributed in the hope that it will be useful but without any
    warranty without even the implied warranty of merchantability or
    fitness for a particular purpose see the gnu general public license
    for more details you should have received a copy of the gnu general
    public license along with this program if not write to the free
    software foundation inc 51 franklin st fifth floor boston ma 02110
    1301 usa

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-only

    has been chosen to replace the boilerplate/reference in 8 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Allison Randal
    Reviewed-by: Enrico Weigelt
    Reviewed-by: Kate Stewart
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190604081207.443595178@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     
  • Based on 1 normalized pattern(s):

    this file is part of the linux kernel and is made available under
    the terms of the gnu general public license version 2

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-only

    has been chosen to replace the boilerplate/reference in 28 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Enrico Weigelt
    Reviewed-by: Allison Randal
    Reviewed-by: Kate Stewart
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190604081206.534229504@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

09 Jun, 2019

1 commit

  • Pull yet more SPDX updates from Greg KH:
    "Another round of SPDX header file fixes for 5.2-rc4

    These are all more "GPL-2.0-or-later" or "GPL-2.0-only" tags being
    added, based on the text in the files. We are slowly chipping away at
    the 700+ different ways people tried to write the license text. All of
    these were reviewed on the spdx mailing list by a number of different
    people.

    We now have over 60% of the kernel files covered with SPDX tags:
    $ ./scripts/spdxcheck.py -v 2>&1 | grep Files
    Files checked: 64533
    Files with SPDX: 40392
    Files with errors: 0

    I think the majority of the "easy" fixups are now done, it's now the
    start of the longer-tail of crazy variants to wade through"

    * tag 'spdx-5.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (159 commits)
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 450
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 449
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 448
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 446
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 445
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 444
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 443
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 442
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 441
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 440
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 438
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 437
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 436
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 435
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 434
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 433
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 432
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 431
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 430
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 429
    ...

    Linus Torvalds
     

08 Jun, 2019

1 commit

  • get_desc() computes a pointer into the LDT while holding a lock that
    protects the LDT from being freed, but then drops the lock and returns the
    (now potentially dangling) pointer to its caller.

    Fix it by giving the caller a copy of the LDT entry instead.
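
    A hedged sketch of the fix pattern (the lock and helpers below are
    hypothetical stand-ins; the real code lives in arch/x86/lib/insn-eval.c):

    /* Copy the descriptor out while the lock protecting the LDT is held,
     * instead of returning a pointer that may dangle once the lock is
     * dropped and the LDT is freed. */
    static bool get_desc(struct desc_struct *out, unsigned short sel)
    {
            bool success = false;

            mutex_lock(&ldt_lock);                  /* hypothetical lock   */
            if (ldt_sel_in_range(sel)) {            /* hypothetical helper */
                    *out = ldt_entry(sel);          /* copy, do not alias  */
                    success = true;
            }
            mutex_unlock(&ldt_lock);

            return success;
    }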

    Fixes: 670f928ba09b ("x86/insn-eval: Add utility function to get segment descriptor")
    Cc: stable@vger.kernel.org
    Signed-off-by: Jann Horn
    Signed-off-by: Linus Torvalds

    Jann Horn
     

05 Jun, 2019

1 commit

  • Based on 1 normalized pattern(s):

    this program is free software you can redistribute it and or modify
    it under the terms of the gnu general public license as published by
    the free software foundation version 2 of the license

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-only

    has been chosen to replace the boilerplate/reference in 315 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Allison Randal
    Reviewed-by: Armijn Hemel
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190531190115.503150771@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

31 May, 2019

4 commits

  • Based on 1 normalized pattern(s):

    subject to the gnu public license v2

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-only

    has been chosen to replace the boilerplate/reference in 1 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Alexios Zavras
    Reviewed-by: Allison Randal
    Reviewed-by: Steve Winslow
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190528171440.222651153@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     
  • Based on 1 normalized pattern(s):

    subject to the gnu public license v 2

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-only

    has been chosen to replace the boilerplate/reference in 9 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Allison Randal
    Reviewed-by: Alexios Zavras
    Reviewed-by: Steve Winslow
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190528171440.130801526@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     
  • Based on 1 normalized pattern(s):

    this program is free software you can redistribute it and or modify
    it under the terms of the gnu general public license as published by
    the free software foundation either version 2 of the license or at
    your option any later version this program is distributed in the
    hope that it will be useful but without any warranty without even
    the implied warranty of merchantability or fitness for a particular
    purpose see the gnu general public license for more details you
    should have received a copy of the gnu general public license along
    with this program if not write to the free software foundation inc
    59 temple place suite 330 boston ma 02111 1307 usa

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-or-later

    has been chosen to replace the boilerplate/reference in 1334 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Allison Randal
    Reviewed-by: Richard Fontana
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190527070033.113240726@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     
  • Based on 1 normalized pattern(s):

    this program is free software you can redistribute it and or modify
    it under the terms of the gnu general public license as published by
    the free software foundation either version 2 of the license or at
    your option any later version

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-or-later

    has been chosen to replace the boilerplate/reference in 3029 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Allison Randal
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

21 May, 2019

1 commit

  • Add SPDX license identifiers to all files which:

    - Have no license information of any form

    - Have EXPORT_.*_SYMBOL_GPL inside which was used in the
    initial scan/conversion to ignore the file

    These files fall under the project license, GPL v2 only. The resulting SPDX
    license identifier is:

    GPL-2.0-only

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

07 May, 2019

4 commits

  • Pull x86 timer updates from Ingo Molnar:
    "Two changes: an LTO improvement, plus the new 'nowatchdog' boot option
    to disable the clocksource watchdog"

    * 'x86-timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86/timer: Don't inline __const_udelay()
    x86/tsc: Add option to disable tsc clocksource watchdog

    Linus Torvalds
     
  • Pull x86 asm updates from Ingo Molnar:
    "This includes the following changes:

    - cpu_has() cleanups

    - sync_bitops.h modernization to the rmwcc.h facility, similarly to
    bitops.h

    - continued LTO annotations/fixes

    - misc cleanups and smaller cleanups"

    * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86/um/vdso: Drop unnecessary cc-ldoption
    x86/vdso: Rename variable to fix -Wshadow warning
    x86/cpu/amd: Exclude 32bit only assembler from 64bit build
    x86/asm: Mark all top level asm statements as .text
    x86/build/vdso: Add FORCE to the build rule of %.so
    x86/asm: Modernize sync_bitops.h
    x86/mm: Convert some slow-path static_cpu_has() callers to boot_cpu_has()
    x86: Convert some slow-path static_cpu_has() callers to boot_cpu_has()
    x86/asm: Clarify static_cpu_has()'s intended use
    x86/uaccess: Fix implicit cast of __user pointer
    x86/cpufeature: Remove __pure attribute to _static_cpu_has()

    Linus Torvalds
     
  • Pull locking updates from Ingo Molnar:
    "Here are the locking changes in this cycle:

    - rwsem unification and simpler micro-optimizations to prepare for
    more intrusive (and more lucrative) scalability improvements in
    v5.3 (Waiman Long)

    - Lockdep irq state tracking flag usage cleanups (Frederic
    Weisbecker)

    - static key improvements (Jakub Kicinski, Peter Zijlstra)

    - misc updates, cleanups and smaller fixes"

    * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (26 commits)
    locking/lockdep: Remove unnecessary unlikely()
    locking/static_key: Don't take sleeping locks in __static_key_slow_dec_deferred()
    locking/static_key: Factor out the fast path of static_key_slow_dec()
    locking/static_key: Add support for deferred static branches
    locking/lockdep: Test all incompatible scenarios at once in check_irq_usage()
    locking/lockdep: Avoid bogus Clang warning
    locking/lockdep: Generate LOCKF_ bit composites
    locking/lockdep: Use expanded masks on find_usage_*() functions
    locking/lockdep: Map remaining magic numbers to lock usage mask names
    locking/lockdep: Move valid_state() inside CONFIG_TRACE_IRQFLAGS && CONFIG_PROVE_LOCKING
    locking/rwsem: Prevent unneeded warning during locking selftest
    locking/rwsem: Optimize rwsem structure for uncontended lock acquisition
    locking/rwsem: Enable lock event counting
    locking/lock_events: Don't show pvqspinlock events on bare metal
    locking/lock_events: Make lock_events available for all archs & other locks
    locking/qspinlock_stat: Introduce generic lockevent_*() counting APIs
    locking/rwsem: Enhance DEBUG_RWSEMS_WARN_ON() macro
    locking/rwsem: Add debug check for __down_read*()
    locking/rwsem: Micro-optimize rwsem_try_read_lock_unqueued()
    locking/rwsem: Move rwsem internal function declarations to rwsem-xadd.h
    ...

    Linus Torvalds
     
  • Pull objtool updates from Ingo Molnar:
    "This is a series from Peter Zijlstra that adds x86 build-time uaccess
    validation of SMAP to objtool, which will detect and warn about the
    following uaccess API usage bugs and weirdnesses:

    - call to %s() with UACCESS enabled
    - return with UACCESS enabled
    - return with UACCESS disabled from a UACCESS-safe function
    - recursive UACCESS enable
    - redundant UACCESS disable
    - UACCESS-safe disables UACCESS

    As it turns out not leaking uaccess permissions outside the intended
    uaccess functionality is hard when the interfaces are complex and when
    such bugs are mostly dormant.

    As a bonus we now also check the DF flag. We had at least one
    high-profile bug in that area in the early days of Linux, and the
    checking is fairly simple. The checks performed and warnings emitted
    are:

    - call to %s() with DF set
    - return with DF set
    - return with modified stack frame
    - recursive STD
    - redundant CLD

    It's all x86-only for now, but later on this can also be used for PAN
    on ARM and objtool is fairly cross-platform in principle.

    While all warnings emitted by this new checking facility that got
    reported to us were fixed, there might be GCC version dependent
    warnings that were not reported yet - which we'll address, should they
    trigger.

    The warnings are non-fatal build warnings"

    * 'core-objtool-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (27 commits)
    mm/uaccess: Use 'unsigned long' to placate UBSAN warnings on older GCC versions
    x86/uaccess: Dont leak the AC flag into __put_user() argument evaluation
    sched/x86_64: Don't save flags on context switch
    objtool: Add Direction Flag validation
    objtool: Add UACCESS validation
    objtool: Fix sibling call detection
    objtool: Rewrite alt->skip_orig
    objtool: Add --backtrace support
    objtool: Rewrite add_ignores()
    objtool: Handle function aliases
    objtool: Set insn->func for alternatives
    x86/uaccess, kcov: Disable stack protector
    x86/uaccess, ftrace: Fix ftrace_likely_update() vs. SMAP
    x86/uaccess, ubsan: Fix UBSAN vs. SMAP
    x86/uaccess, kasan: Fix KASAN vs SMAP
    x86/smap: Ditch __stringify()
    x86/uaccess: Introduce user_access_{save,restore}()
    x86/uaccess, signal: Fix AC=1 bloat
    x86/uaccess: Always inline user_access_begin()
    x86/uaccess, xen: Suppress SMAP warnings
    ...

    Linus Torvalds
     

30 Apr, 2019

1 commit

  • Enablement of AMD's Secure Memory Encryption feature is determined very
    early after start_kernel() is entered. Part of this procedure involves
    scanning the command line for the parameter 'mem_encrypt'.

    To determine intended state, the function sme_enable() uses library
    functions cmdline_find_option() and strncmp(). Their use occurs early
    enough such that it cannot be assumed that any instrumentation subsystem
    is initialized.

    For example, making calls to a KASAN-instrumented function before KASAN
    is set up will result in the use of uninitialized memory and a boot
    failure.

    When AMD's SME support is enabled, conditionally disable instrumentation
    of these dependent functions in lib/string.c and arch/x86/lib/cmdline.c.

    [ bp: Get rid of intermediary nostackp var and cleanup whitespace. ]

    Fixes: aca20d546214 ("x86/mm: Add support to make use of Secure Memory Encryption")
    Reported-by: Li RongQing
    Signed-off-by: Gary R Hook
    Signed-off-by: Borislav Petkov
    Cc: Alexander Shishkin
    Cc: Andrew Morton
    Cc: Andy Shevchenko
    Cc: Boris Brezillon
    Cc: Coly Li
    Cc: "dave.hansen@linux.intel.com"
    Cc: "H. Peter Anvin"
    Cc: Ingo Molnar
    Cc: Kees Cook
    Cc: Kent Overstreet
    Cc: "luto@kernel.org"
    Cc: Masahiro Yamada
    Cc: Matthew Wilcox
    Cc: "mingo@redhat.com"
    Cc: "peterz@infradead.org"
    Cc: Sebastian Andrzej Siewior
    Cc: Thomas Gleixner
    Cc: x86-ml
    Link: https://lkml.kernel.org/r/155657657552.7116.18363762932464011367.stgit@sosrh3.amd.com

    Gary Hook
     

19 Apr, 2019

2 commits

  • LTO will happily inline __const_udelay() everywhere it is used. Forcing it
    noinline saves ~44k of text in an LTO build.

    13999560 1740864 1499136 17239560 1070e08 vmlinux-with-udelay-inline
    13954764 1736768 1499136 17190668 1064f0c vmlinux-wo-udelay-inline

    Even without LTO this function should never be inlined.
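
    A minimal sketch of the annotation (the real function is in
    arch/x86/lib/delay.c; the body is elided here):

    #include <linux/compiler.h>

    /* noinline forces a single out-of-line copy; udelay() expands to
     * __const_udelay() all over the tree, so inlining it everywhere costs
     * far more text than the calls save. */
    noinline void __const_udelay(unsigned long xloops)
    {
            /* calibrated delay loop elided */
    }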

    Signed-off-by: Andi Kleen
    Signed-off-by: Thomas Gleixner
    Link: https://lkml.kernel.org/r/20190330004743.29541-4-andi@firstfloor.org

    Andi Kleen
     
  • With gcc, toplevel assembler statements that do not mark themselves as .text
    may end up in other sections. This causes LTO boot crashes because various
    assembler statements ended up in the middle of the initcall section. It's
    also a latent problem without LTO, although it's currently not known to
    cause any real problems.

    According to the gcc team it's expected behavior.

    Always mark all the top level assembler statements as text so that they
    switch to the right section.
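
    A hedged illustration of the fix (the symbol below is made up; the
    upstream change touched the kernel's existing top-level asm statements):

    /* A file-scope asm() is emitted into whatever section the compiler is
     * currently generating, which under LTO can be an init or data section.
     * Explicitly switching to .text (and back) pins it where it belongs. */
    asm(".pushsection .text\n"
        ".globl demo_stub\n"
        "demo_stub:\n"
        "        ret\n"
        ".popsection\n");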

    Signed-off-by: Andi Kleen
    Signed-off-by: Thomas Gleixner
    Link: https://lkml.kernel.org/r/20190330004743.29541-1-andi@firstfloor.org

    Andi Kleen
     

03 Apr, 2019

3 commits

  • As the generic rwsem-xadd code is using the appropriate acquire and
    release versions of the atomic operations, the arch specific rwsem.h
    files will not be that much faster than the generic code as long as the
    atomic functions are properly implemented. So we can remove those arch
    specific rwsem.h and stop building asm/rwsem.h to reduce maintenance
    effort.

    Currently, only x86, alpha and ia64 have implemented architecture
    specific fast paths. I don't have access to alpha and ia64 systems for
    testing, but they are legacy systems that are not likely to be updated
    to the latest kernel anyway.

    By using a rwsem microbenchmark, the total locking rates on a 4-socket
    56-core 112-thread x86-64 system before and after the patch were as
    follows (mixed means equal # of read and write locks):

                         Before Patch               After Patch
    # of Threads    wlock   rlock   mixed     wlock   rlock   mixed
    ------------    -----   -----   -----     -----   -----   -----
          1        29,201  30,143  29,458    28,615  30,172  29,201
          2         6,807  13,299   1,171     7,725  15,025   1,804
          4         6,504  12,755   1,520     7,127  14,286   1,345
          8         6,762  13,412     764     6,826  13,652     726
         16         6,693  15,408     662     6,599  15,938     626
         32         6,145  15,286     496     5,549  15,487     511
         64         5,812  15,495      60     5,858  15,572      60

    There were some run-to-run variations for the multi-thread tests. For
    x86-64, using the generic C code fast path seems to be a little bit
    faster than the assembly version with low lock contention. Looking at
    the assembly version of the fast paths, there are assembly to/from C
    code wrappers that save and restore all the callee-clobbered registers
    (7 registers on x86-64). The assembly generated from the generic C
    code doesn't need to do that. That may explain the slight performance
    gain here.

    The generic asm rwsem.h can also be merged into kernel/locking/rwsem.h
    with no code change as no other code other than those under
    kernel/locking needs to access the internal rwsem macros and functions.

    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Acked-by: Linus Torvalds
    Cc: Andrew Morton
    Cc: Arnd Bergmann
    Cc: Borislav Petkov
    Cc: Davidlohr Bueso
    Cc: H. Peter Anvin
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Tim Chen
    Cc: Will Deacon
    Cc: linux-arm-kernel@lists.infradead.org
    Cc: linux-c6x-dev@linux-c6x.org
    Cc: linux-m68k@lists.linux-m68k.org
    Cc: linux-riscv@lists.infradead.org
    Cc: linux-um@lists.infradead.org
    Cc: linux-xtensa@linux-xtensa.org
    Cc: linuxppc-dev@lists.ozlabs.org
    Cc: nios2-dev@lists.rocketboards.org
    Cc: openrisc@lists.librecores.org
    Cc: uclinux-h8-devel@lists.sourceforge.jp
    Link: https://lkml.kernel.org/r/20190322143008.21313-2-longman@redhat.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     
  • New tooling got confused about this:

    arch/x86/lib/memcpy_64.o: warning: objtool: .fixup+0x7: return with UACCESS enabled

    While the code isn't wrong, it is tedious (if at all possible) to
    figure out what function a particular chunk of .fixup belongs to.

    This then confuses the objtool uaccess validation. Instead of
    returning directly from the .fixup, jump back into the right function.

    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Borislav Petkov
    Cc: Josh Poimboeuf
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • By writing the function in asm, we avoid cross-object code flow and
    objtool no longer gets confused about a 'stray' CLAC.

    Also, the asm version is actually _simpler_.

    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Borislav Petkov
    Cc: Josh Poimboeuf
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

21 Mar, 2019

1 commit

  • The increment of buff is indented one level too deeply; clean
    this up by removing a tab.

    Signed-off-by: Colin Ian King
    Signed-off-by: Thomas Gleixner
    Cc: Borislav Petkov
    Cc: "H . Peter Anvin"
    Cc: kernel-janitors@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190314230838.18256-1-colin.king@canonical.com

    Colin Ian King
     

08 Mar, 2019

1 commit

  • Pull x86 cleanups from Ingo Molnar:
    "Various cleanups and simplifications, none of them really stands out,
    they are all over the place"

    * 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86/uaccess: Remove unused __addr_ok() macro
    x86/smpboot: Remove unused phys_id variable
    x86/mm/dump_pagetables: Remove the unused prev_pud variable
    x86/fpu: Move init_xstate_size() to __init section
    x86/cpu_entry_area: Move percpu_setup_debug_store() to __init section
    x86/mtrr: Remove unused variable
    x86/boot/compressed/64: Explain paging_prepare()'s return value
    x86/resctrl: Remove duplicate MSR_MISC_FEATURE_CONTROL definition
    x86/asm/suspend: Drop ENTRY from local data
    x86/hw_breakpoints, kprobes: Remove kprobes ifdeffery
    x86/boot: Save several bytes in decompressor
    x86/trap: Remove useless declaration
    x86/mm/tlb: Remove unused cpu variable
    x86/events: Mark expected switch-case fall-throughs
    x86/asm-prototypes: Remove duplicate include
    x86/kernel: Mark expected switch-case fall-throughs
    x86/insn-eval: Mark expected switch-case fall-through
    x86/platform/UV: Replace kmalloc() and memset() with k[cz]alloc() calls
    x86/e820: Replace kmalloc() + memcpy() with kmemdup()

    Linus Torvalds
     

06 Mar, 2019

1 commit

  • The descriptions of userspace memory access functions had minor issues
    with formatting that made kernel-doc unable to properly detect the
    function/macro names and the return value sections:

    ./arch/x86/include/asm/uaccess.h:80: info: Scanning doc for
    ./arch/x86/include/asm/uaccess.h:139: info: Scanning doc for
    ./arch/x86/include/asm/uaccess.h:231: info: Scanning doc for
    ./arch/x86/include/asm/uaccess.h:505: info: Scanning doc for
    ./arch/x86/include/asm/uaccess.h:530: info: Scanning doc for
    ./arch/x86/lib/usercopy_32.c:58: info: Scanning doc for
    ./arch/x86/lib/usercopy_32.c:69: warning: No description found for return
    value of 'clear_user'
    ./arch/x86/lib/usercopy_32.c:78: info: Scanning doc for
    ./arch/x86/lib/usercopy_32.c:90: warning: No description found for return
    value of '__clear_user'

    Fix the formatting.
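
    For illustration, a kernel-doc comment in the shape the tool expects,
    including the "Return:" section it was missing (wording approximate):

    /**
     * clear_user - Zero a block of memory in user space.
     * @to:  Destination address, in user space.
     * @n:   Number of bytes to zero.
     *
     * Zero a block of memory in user space.
     *
     * Return: number of bytes that could not be cleared.
     * On success, this will be zero.
     */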

    Link: http://lkml.kernel.org/r/1549549644-4903-3-git-send-email-rppt@linux.ibm.com
    Signed-off-by: Mike Rapoport
    Reviewed-by: Andrew Morton
    Cc: Jonathan Corbet
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Rapoport
     

02 Feb, 2019

1 commit

  • In commit 170d13ca3a2f ("x86: re-introduce non-generic memcpy_{to,from}io")
    I made our copy from IO space use a separate copy routine rather than
    rely on the generic memcpy. I did that because our generic memory copy
    isn't actually well-defined when it comes to internal access ordering or
    alignment, and will in fact depend on various CPUID flags.

    In particular, the default memcpy() for a modern Intel CPU will
    generally be just a "rep movsb", which works reasonably well for
    medium-sized memory copies of regular RAM, since the CPU will turn it
    into fairly optimized microcode.

    However, for non-cached memory and IO, "rep movs" ends up being
    horrendously slow and will just do the architectural "one byte at a
    time" accesses implied by the movsb.

    At the other end of the spectrum, if you _don't_ end up using the "rep
    movsb" code, you'd likely fall back to the software copy, which does
    overlapping accesses for the tail, and may copy things backwards.
    Again, for regular memory that's fine, for IO memory not so much.

    The thinking was that clearly nobody really cared (because things
    worked), but some people had seen horrible performance due to the byte
    accesses, so let's just revert to our long-ago version that did
    "rep movsl" for the bulk of the copy and then fixed up the potential
    last few bytes of the tail with "movsw/b".

    Interestingly (and perhaps not entirely surprisingly), while that was
    our original memory copy implementation, and had been used before for
    IO, in the meantime many new users of memcpy_*io() had come about. And
    while the access patterns for the memory copy weren't well-defined (so
    arguably _any_ access pattern should work), in practice the "rep movsb"
    case had been very common for the last several years.

    In particular Jarkko Sakkinen reported that the memcpy_*io() change
    resulted in weird errors from his Geminilake NUC TPM module.

    And it turns out that the TPM TCG accesses according to spec require
    that the accesses be

    (a) done strictly sequentially

    (b) naturally aligned

    otherwise the TPM chip will abort the PCI transaction.

    And, in fact, the tpm_crb.c driver did this:

    memcpy_fromio(buf, priv->rsp, 6);
    ...
    memcpy_fromio(&buf[6], &priv->rsp[6], expected - 6);

    which really should never have worked in the first place, but back
    before commit 170d13ca3a2f it *happened* to work, because the
    memcpy_fromio() would be expanded to a regular memcpy, and

    (a) gcc would expand the first memcpy in-line, and turn it into a
    4-byte and a 2-byte read, and they happened to be in the right
    order, and the alignment was right.

    (b) gcc would call "memcpy()" for the second one, and the machines that
    had this TPM chip also apparently ended up always having ERMS
    ("Enhanced REP MOVSB/STOSB instructions"), so we'd use the "rep
    movbs" for that copy.

    In other words, basically by pure luck, the code happened to use the
    right access sizes in the (two different!) memcpy() implementations to
    make it all work.

    But after commit 170d13ca3a2f, both of the memcpy_fromio() calls
    resulted in a call to the routine with the consistent memory accesses,
    and in both cases it started out transferring with 4-byte accesses.
    Which worked for the first copy, but resulted in the second copy doing a
    32-bit read at an address that was only 2-byte aligned.

    Jarkko is actually fixing the fragile code in the TPM driver, but since
    this is an excellent example of why we absolutely must not use a generic
    memcpy for IO accesses, _and_ an IO-specific one really should strive to
    align the IO accesses, let's do exactly that.
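
    A hedged sketch of the resulting access pattern (simplified; not the
    exact upstream implementation):

    #include <linux/io.h>
    #include <linux/types.h>

    static void memcpy_fromio_aligned(void *to,
                                      const volatile void __iomem *from,
                                      size_t n)
    {
            /* peel off leading bytes until the IO address is 4-byte aligned */
            while (n && ((unsigned long)from & 3)) {
                    *(u8 *)to = readb(from);
                    to++; from++; n--;
            }
            /* bulk of the copy with naturally aligned 32-bit reads */
            while (n >= 4) {
                    *(u32 *)to = readl(from);
                    to += 4; from += 4; n -= 4;
            }
            /* byte-wise tail */
            while (n) {
                    *(u8 *)to = readb(from);
                    to++; from++; n--;
            }
    }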

    Side note: Jarkko also noted that the driver had been used on ARM
    platforms, and had worked. That was because on 32-bit ARM, memcpy_*io()
    ends up always doing byte accesses, and on 64-bit ARM it first does byte
    accesses to align to 8-byte boundaries, and then does 8-byte accesses
    for the bulk.

    So ARM actually worked by design, and the x86 case worked by pure luck.

    We *might* want to make x86-64 do the 8-byte case too. That should be a
    pretty straightforward extension, but let's do one thing at a time. And
    generally MMIO accesses aren't really all that performance-critical, as
    shown by the fact that for a long time we just did them a byte at a
    time, and very few people ever noticed.

    Reported-and-tested-by: Jarkko Sakkinen
    Tested-by: Jerry Snitselaar
    Cc: David Laight
    Fixes: 170d13ca3a2f ("x86: re-introduce non-generic memcpy_{to,from}io")
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

26 Jan, 2019

1 commit

  • In preparation to enable -Wimplicit-fallthrough by default, mark
    switch-case statements where fall-through is intentional, explicitly.

    Thus fix the following warning:

    arch/x86/lib/insn-eval.c: In function ‘resolve_default_seg’:
    arch/x86/lib/insn-eval.c:179:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
    if (insn->addr_bytes == 2)
    ^
    arch/x86/lib/insn-eval.c:182:2: note: here
    case -EDOM:
    ^~~~

    Warning level 3 was used: -Wimplicit-fallthrough=3

    This is part of the ongoing efforts to enable -Wimplicit-fallthrough by
    default.
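
    A self-contained illustration of the annotation style used at the time
    (function and values are made up; the marker was a comment, predating
    the later fallthrough pseudo-keyword):

    #include <linux/errno.h>

    static int resolve_seg_demo(int addr_bytes, int regno)
    {
            switch (regno) {
            case 0:
                    if (addr_bytes == 2)
                            return -EINVAL;
                    /* fall through */
            case 1:
                    return 0;
            default:
                    return -EDOM;
            }
    }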

    [ bp: Massage commit message. ]

    Signed-off-by: Gustavo A. R. Silva
    Signed-off-by: Borislav Petkov
    Cc: "H. Peter Anvin"
    Cc: Ingo Molnar
    Cc: Kees Cook
    Cc: Thomas Gleixner
    Cc: x86-ml
    Link: https://lkml.kernel.org/r/20190125205520.GA9602@embeddedor

    Gustavo A. R. Silva
     

12 Jan, 2019

1 commit

  • The outb() function takes parameters value and port, in that order. Fix
    the parameters used in the KASLR i8254 fallback code.
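
    A hedged illustration of the calling convention (the constants are
    made-up stand-ins for the i8254 defines):

    #include <asm/io.h>

    #define DEMO_PORT_CONTROL       0x43    /* i8254 mode/command port */
    #define DEMO_CMD_READBACK       0xc0

    static void program_i8254_demo(void)
    {
            /* x86 outb() is outb(value, port); swapping the arguments
             * writes the wrong value to the wrong port and the compiler
             * cannot warn, since both arguments are plain integers. */
            outb(DEMO_CMD_READBACK, DEMO_PORT_CONTROL);
    }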

    Fixes: 5bfce5ef55cb ("x86, kaslr: Provide randomness functions")
    Signed-off-by: Daniel Drake
    Signed-off-by: Thomas Gleixner
    Cc: bp@alien8.de
    Cc: hpa@zytor.com
    Cc: linux@endlessm.com
    Cc: stable@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190107034024.15005-1-drake@endlessm.com

    Daniel Drake
     

05 Jan, 2019

1 commit

  • This has been broken forever, and nobody ever really noticed because
    it's purely a performance issue.

    Long long ago, in commit 6175ddf06b61 ("x86: Clean up mem*io functions")
    Brian Gerst simplified the memory copies to and from iomem, since on
    x86, the instructions to access iomem are exactly the same as the
    regular instructions.

    That is technically true, and things worked, and nobody said anything.
    Besides, back then the regular memcpy was pretty simple and worked fine.

    Nobody noticed except for David Laight, that is. David has been testing a
    TLP monitor he was writing for an FPGA, and has been occasionally
    complaining about how memcpy_toio() writes things one byte at a time.

    Which is completely unacceptable from a performance standpoint, even if
    it happens to technically work.

    The reason it's writing one byte at a time is because while it's
    technically true that accesses to iomem are the same as accesses to
    regular memory on x86, the _granularity_ (and ordering) of accesses
    matter to iomem in ways that they don't matter to regular cached memory.

    In particular, when ERMS is set, we default to using "rep movsb" for
    larger memory copies. That is indeed perfectly fine for real memory,
    since the whole point is that the CPU is going to do cacheline
    optimizations and executes the memory copy efficiently for cached
    memory.

    With iomem? Not so much. With iomem, "rep movsb" will indeed work, but
    it will copy things one byte at a time. Slowly and ponderously.

    Now, originally, back in 2010 when commit 6175ddf06b61 was done, we
    didn't use ERMS, and this was much less noticeable.

    Our normal memcpy() was simpler in other ways too.

    Because in fact, it's not just about using the string instructions. Our
    memcpy() these days does things like "read and write overlapping values"
    to handle the last bytes of the copy. Again, for normal memory,
    overlapping accesses isn't an issue. For iomem? It can be.

    So this re-introduces the specialized memcpy_toio(), memcpy_fromio() and
    memset_io() functions. It doesn't particularly optimize them, but it
    tries to at least not be horrid, or do overlapping accesses. In fact,
    this uses the existing __inline_memcpy() function that we still had
    lying around that uses our very traditional "rep movsl" loop followed by
    movsw/movsb for the final bytes.

    Somebody may decide to try to improve on it, but if we've gone almost a
    decade with only one person really ever noticing and complaining, maybe
    it's not worth worrying about further, once it's not _completely_ broken?

    Reported-by: David Laight
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

04 Jan, 2019

1 commit

  • Nobody has actually used the type (VERIFY_READ vs VERIFY_WRITE) argument
    of the user address range verification function since we got rid of the
    old racy i386-only code to walk page tables by hand.

    It existed because the original 80386 would not honor the write protect
    bit when in kernel mode, so you had to do COW by hand before doing any
    user access. But we haven't supported that in a long time, and these
    days the 'type' argument is a purely historical artifact.

    A discussion about extending 'user_access_begin()' to do the range
    checking resulted in this patch, because there is no way we're going to
    move the old VERIFY_xyz interface to that model. And it's best done at
    the end of the merge window when I've done most of my merges, so let's
    just get this done once and for all.

    This patch was mostly done with a sed-script, with manual fix-ups for
    the cases that weren't of the trivial 'access_ok(VERIFY_xyz' form.
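
    For the trivial form, the conversion looks roughly like this
    (illustrative caller, not from a specific file):

    /* before */
    if (!access_ok(VERIFY_WRITE, buf, count))
            return -EFAULT;

    /* after: the type argument is gone, the range check is unchanged */
    if (!access_ok(buf, count))
            return -EFAULT;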

    There were a couple of notable cases:

    - csky still had the old "verify_area()" name as an alias.

    - the iter_iov code had magical hardcoded knowledge of the actual
    values of VERIFY_{READ,WRITE} (not that they mattered, since nothing
    really used it)

    - microblaze used the type argument for a debug printout

    but other than those oddities this should be a total no-op patch.

    I tried to fix up all architectures, did fairly extensive grepping for
    access_ok() uses, and the changes are trivial, but I may have missed
    something. Any missed conversion should be trivially fixable, though.

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

23 Oct, 2018

1 commit

  • Pull x86 asm updates from Ingo Molnar:
    "The main changes in this cycle were the fsgsbase related preparatory
    patches from Chang S. Bae - but there's also an optimized
    memcpy_flushcache() and a cleanup for the __cmpxchg_double() assembly
    glue"

    * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86/fsgsbase/64: Clean up various details
    x86/segments: Introduce the 'CPUNODE' naming to better document the segment limit CPU/node NR trick
    x86/vdso: Initialize the CPU/node NR segment descriptor earlier
    x86/vdso: Introduce helper functions for CPU and node number
    x86/segments/64: Rename the GDT PER_CPU entry to CPU_NUMBER
    x86/fsgsbase/64: Factor out FS/GS segment loading from __switch_to()
    x86/fsgsbase/64: Convert the ELF core dump code to the new FSGSBASE helpers
    x86/fsgsbase/64: Make ptrace use the new FS/GS base helpers
    x86/fsgsbase/64: Introduce FS/GS base helper functions
    x86/fsgsbase/64: Fix ptrace() to read the FS/GS base accurately
    x86/asm: Use CC_SET()/CC_OUT() in __cmpxchg_double()
    x86/asm: Optimize memcpy_flushcache()

    Linus Torvalds
     

10 Sep, 2018

1 commit

  • I use memcpy_flushcache() in my persistent memory driver for metadata
    updates; there are many 8-byte and 16-byte updates, and it turns out that
    the overhead of memcpy_flushcache causes 2% performance degradation
    compared to "movnti" instruction explicitly coded using inline assembler.

    The tests were done on a Skylake processor with persistent memory emulated
    using the "memmap" kernel parameter. dd was used to copy data to the
    dm-writecache target.

    This patch recognizes memcpy_flushcache calls with constant short length
    and turns them into inline assembler - so that I don't have to use inline
    assembler in the driver.
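
    A hedged sketch of the shape of the optimization, reduced to the 8-byte
    case (the real inline handles more sizes; the out-of-line helper name
    is assumed):

    #include <linux/compiler.h>
    #include <linux/types.h>

    static __always_inline void
    memcpy_flushcache_demo(void *dst, const void *src, size_t cnt)
    {
            if (__builtin_constant_p(cnt) && cnt == 8) {
                    /* one non-temporal store, no function call */
                    asm ("movnti %1, %0"
                         : "=m" (*(u64 *)dst)
                         : "r" (*(const u64 *)src));
                    return;
            }
            __memcpy_flushcache(dst, src, cnt);     /* out-of-line path */
    }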

    Signed-off-by: Mikulas Patocka
    Cc: Dan Williams
    Cc: Linus Torvalds
    Cc: Mike Snitzer
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: device-mapper development
    Link: http://lkml.kernel.org/r/alpine.LRH.2.02.1808081720460.24747@file01.intranet.prod.int.rdu2.redhat.com
    Signed-off-by: Ingo Molnar

    Mikulas Patocka
     

03 Sep, 2018

1 commit

  • Currently, most fixups for attempting to access userspace memory are
    handled using _ASM_EXTABLE, which is also used for various other types of
    fixups (e.g. safe MSR access, IRET failures, and a bunch of other things).
    In order to make it possible to add special safety checks to uaccess fixups
    (in particular, checking whether the fault address is actually in
    userspace), introduce a new exception table handler ex_handler_uaccess()
    and wire it up to all the user access fixups (excluding ones that
    already use _ASM_EXTABLE_EX).
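
    A hedged sketch of the shape of such a handler (signature and helpers
    abridged from the x86 extable code of that era):

    __visible bool ex_handler_uaccess(const struct exception_table_entry *fixup,
                                      struct pt_regs *regs, int trapnr)
    {
            /* the place for uaccess-only sanity checks, e.g. warning when
             * the faulting access was not actually to a user address */
            regs->ip = ex_fixup_addr(fixup);        /* resume at the fixup */
            return true;
    }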

    Signed-off-by: Jann Horn
    Signed-off-by: Thomas Gleixner
    Tested-by: Kees Cook
    Cc: Andy Lutomirski
    Cc: kernel-hardening@lists.openwall.com
    Cc: dvyukov@google.com
    Cc: Masami Hiramatsu
    Cc: "Naveen N. Rao"
    Cc: Anil S Keshavamurthy
    Cc: "David S. Miller"
    Cc: Alexander Viro
    Cc: linux-fsdevel@vger.kernel.org
    Cc: Borislav Petkov
    Link: https://lkml.kernel.org/r/20180828201421.157735-5-jannh@google.com

    Jann Horn
     

31 Aug, 2018

1 commit

  • An NMI can hit in the middle of context switching or in the middle of
    switch_mm_irqs_off(). In either case, CR3 might not match current->mm,
    which could cause copy_from_user_nmi() and friends to read the wrong
    memory.

    Fix it by adding a new nmi_uaccess_okay() helper and checking it in
    copy_from_user_nmi() and in __copy_from_user_nmi()'s callers.
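
    A hedged sketch of the check (simplified from the real helper in the
    x86 mm/tlb code):

    #include <linux/sched.h>
    #include <asm/tlbflush.h>

    static inline bool nmi_uaccess_okay_demo(void)
    {
            /* An NMI can land mid context switch, so the live CR3 (tracked
             * via cpu_tlbstate.loaded_mm) may not belong to current->mm;
             * user copies would then read through the wrong page tables. */
            return current->mm &&
                   this_cpu_read(cpu_tlbstate.loaded_mm) == current->mm;
    }

    copy_from_user_nmi() and the other callers bail out early when the
    check fails instead of attempting the copy.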

    Signed-off-by: Andy Lutomirski
    Signed-off-by: Thomas Gleixner
    Reviewed-by: Rik van Riel
    Cc: Nadav Amit
    Cc: Borislav Petkov
    Cc: Jann Horn
    Cc: Peter Zijlstra
    Cc: stable@vger.kernel.org
    Link: https://lkml.kernel.org/r/dd956eba16646fd0b15c3c0741269dfd84452dac.1535557289.git.luto@kernel.org

    Andy Lutomirski
     

03 Jul, 2018

1 commit

  • Some Intel CPUs don't recognize 64-bit XORs as zeroing idioms. Zeroing
    idioms don't require execution bandwidth, as they're being taken care
    of in the frontend (through register renaming). Use 32-bit XORs instead.
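
    A hedged, standalone illustration of the difference (demo function, not
    taken from the patch itself):

    static unsigned long zero_reg_demo(void)
    {
            unsigned long ret;

            /* %k0 prints the 32-bit register name: writing %eax
             * zero-extends into %rax, and the 32-bit xor is recognized as
             * a zeroing idiom, handled at rename with no execution
             * bandwidth; the 64-bit form is not on some Intel CPUs. */
            asm ("xorl %k0, %k0" : "=r" (ret));
            return ret;
    }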

    Signed-off-by: Jan Beulich
    Cc: Alok Kataria
    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Denys Vlasenko
    Cc: H. Peter Anvin
    Cc: Josh Poimboeuf
    Cc: Juergen Gross
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: davem@davemloft.net
    Cc: herbert@gondor.apana.org.au
    Cc: pavel@ucw.cz
    Cc: rjw@rjwysocki.net
    Link: http://lkml.kernel.org/r/5B39FF1A02000078001CFB54@prv1-mh.provo.novell.com
    Signed-off-by: Ingo Molnar

    Jan Beulich