08 Aug, 2019

1 commit

  • When building with W=1, warnings about missing prototypes are emitted:

    CC arch/x86/lib/cpu.o
    arch/x86/lib/cpu.c:5:14: warning: no previous prototype for 'x86_family' [-Wmissing-prototypes]
        5 | unsigned int x86_family(unsigned int sig)
          |              ^~~~~~~~~~
    arch/x86/lib/cpu.c:18:14: warning: no previous prototype for 'x86_model' [-Wmissing-prototypes]
       18 | unsigned int x86_model(unsigned int sig)
          |              ^~~~~~~~~
    arch/x86/lib/cpu.c:33:14: warning: no previous prototype for 'x86_stepping' [-Wmissing-prototypes]
       33 | unsigned int x86_stepping(unsigned int sig)
          |              ^~~~~~~~~~~~

    Add the proper include file so the prototypes are there.
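
    A minimal sketch of the pattern (assuming the prototypes live in
    <asm/cpu.h>; this is an illustration, not necessarily the exact
    upstream diff):

    /* Including the header that declares the function both silences
     * -Wmissing-prototypes and lets the compiler verify that definition
     * and declaration agree. */
    #include <asm/cpu.h>

    unsigned int x86_family(unsigned int sig)
    {
            unsigned int x86;

            x86 = (sig >> 8) & 0xf;
            if (x86 == 0xf)
                    x86 += (sig >> 20) & 0xff;

            return x86;
    }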

    Signed-off-by: Valdis Kletnieks
    Signed-off-by: Thomas Gleixner
    Link: https://lkml.kernel.org/r/42513.1565234837@turing-police

    Valdis Klētnieks
     

19 Jul, 2019

3 commits

  • The same getuser/putuser error paths are used regardless of whether AC
    is set. In non-exception failure cases, this results in an unnecessary
    CLAC.

    Fixes the following warnings:

    arch/x86/lib/getuser.o: warning: objtool: .altinstr_replacement+0x18: redundant UACCESS disable
    arch/x86/lib/putuser.o: warning: objtool: .altinstr_replacement+0x18: redundant UACCESS disable

    Signed-off-by: Josh Poimboeuf
    Signed-off-by: Thomas Gleixner
    Acked-by: Peter Zijlstra (Intel)
    Link: https://lkml.kernel.org/r/bc14ded2755ae75bd9010c446079e113dbddb74b.1563413318.git.jpoimboe@redhat.com

    Josh Poimboeuf
     
  • After adding mcsafe_handle_tail() to the objtool uaccess safe list,
    objtool reports:

    arch/x86/lib/usercopy_64.o: warning: objtool: mcsafe_handle_tail()+0x0: call to __fentry__() with UACCESS enabled

    With SMAP, this function is called with AC=1, so it needs to be careful
    about which functions it calls. Disable the ftrace entry hook, which
    can potentially pull in a lot of extra code.
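
    A hedged sketch of the annotation (the real helper is in
    arch/x86/lib/usercopy_64.c; the body here is a trivial stand-in):

    #include <linux/compiler.h>

    /* notrace suppresses the __fentry__/mcount call the compiler would
     * otherwise emit at function entry, so no tracing code runs while the
     * caller still has AC=1, i.e. the SMAP user-access window is open. */
    __visible notrace unsigned long
    mcsafe_handle_tail(char *to, char *from, unsigned len)
    {
            /* careful byte-by-byte tail copy elided */
            return len;
    }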

    Signed-off-by: Josh Poimboeuf
    Signed-off-by: Thomas Gleixner
    Acked-by: Peter Zijlstra (Intel)
    Link: https://lkml.kernel.org/r/8e13d6f0da1c8a3f7603903da6cbf6d582bbfe10.1563413318.git.jpoimboe@redhat.com

    Josh Poimboeuf
     
  • After an objtool improvement, it's complaining about the CLAC in
    copy_user_handle_tail():

    arch/x86/lib/copy_user_64.o: warning: objtool: .altinstr_replacement+0x12: redundant UACCESS disable
    arch/x86/lib/copy_user_64.o: warning: objtool: copy_user_handle_tail()+0x6: (alt)
    arch/x86/lib/copy_user_64.o: warning: objtool: copy_user_handle_tail()+0x2: (alt)
    arch/x86/lib/copy_user_64.o: warning: objtool: copy_user_handle_tail()+0x0:

    Signed-off-by: Josh Poimboeuf
    Signed-off-by: Thomas Gleixner
    Acked-by: Peter Zijlstra (Intel)
    Link: https://lkml.kernel.org/r/6b6e436774678b4b9873811ff023bd29935bee5b.1563413318.git.jpoimboe@redhat.com

    Josh Poimboeuf
     

09 Jul, 2019

1 commit

  • Pull SMP/hotplug updates from Thomas Gleixner:
    "A small set of updates for SMP and CPU hotplug:

    - Abort disabling secondary CPUs in the freezer when a wakeup is
    pending instead of evaluating it only after all CPUs have been
    offlined.

    - Remove the shared annotation for the strict per CPU cfd_data in the
    smp function call core code.

    - Remove the return values of smp_call_function() and on_each_cpu()
    as they are unconditionally 0. Fixup the few callers which actually
    bothered to check the return value"

    * 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    smp: Remove smp_call_function() and on_each_cpu() return values
    smp: Do not mark call_function_data as shared
    cpu/hotplug: Abort disabling secondary CPUs if wakeup is pending
    cpu/hotplug: Fix notify_cpu_starting() reference in bringup_wait_for_ap()

    Linus Torvalds
     

23 Jun, 2019

1 commit

  • The return value is unconditionally zero. Remove it and amend the callers.

    [ tglx: Fixup arm/bL_switcher and powerpc/rtas ]
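
    A hedged sketch of the interface change plus a typical caller fixup
    (the caller below is illustrative, not taken from a specific file):

    #include <linux/smp.h>

    /*
     * After the change, include/linux/smp.h reads (return value dropped,
     * since it was unconditionally 0):
     *
     *   void smp_call_function(smp_call_func_t func, void *info, int wait);
     *   void on_each_cpu(smp_call_func_t func, void *info, int wait);
     */

    static void do_flush_demo(void *info)
    {
            /* per-CPU work */
    }

    static void flush_all_demo(void)
    {
            /* was: ret = smp_call_function(...); if (ret) ...; the check
             * is dropped because the call can no longer "fail" */
            smp_call_function(do_flush_demo, NULL, 1);
    }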

    Signed-off-by: Nadav Amit
    Signed-off-by: Thomas Gleixner
    Cc: Peter Zijlstra
    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Dave Hansen
    Cc: Richard Henderson
    Cc: Ivan Kokshaysky
    Cc: Matt Turner
    Cc: Tony Luck
    Cc: Fenghua Yu
    Cc: Andrew Morton
    Link: https://lkml.kernel.org/r/20190613064813.8102-2-namit@vmware.com

    Nadav Amit
     

19 Jun, 2019

2 commits

  • Based on 1 normalized pattern(s):

    this file is free software you can redistribute it and or modify it
    under the terms of version 2 of the gnu general public license as
    published by the free software foundation this program is
    distributed in the hope that it will be useful but without any
    warranty without even the implied warranty of merchantability or
    fitness for a particular purpose see the gnu general public license
    for more details you should have received a copy of the gnu general
    public license along with this program if not write to the free
    software foundation inc 51 franklin st fifth floor boston ma 02110
    1301 usa

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-only

    has been chosen to replace the boilerplate/reference in 8 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Allison Randal
    Reviewed-by: Enrico Weigelt
    Reviewed-by: Kate Stewart
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190604081207.443595178@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     
  • Based on 1 normalized pattern(s):

    this file is part of the linux kernel and is made available under
    the terms of the gnu general public license version 2

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-only

    has been chosen to replace the boilerplate/reference in 28 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Enrico Weigelt
    Reviewed-by: Allison Randal
    Reviewed-by: Kate Stewart
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190604081206.534229504@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

09 Jun, 2019

1 commit

  • Pull yet more SPDX updates from Greg KH:
    "Another round of SPDX header file fixes for 5.2-rc4

    These are all more "GPL-2.0-or-later" or "GPL-2.0-only" tags being
    added, based on the text in the files. We are slowly chipping away at
    the 700+ different ways people tried to write the license text. All of
    these were reviewed on the spdx mailing list by a number of different
    people.

    We now have over 60% of the kernel files covered with SPDX tags:
    $ ./scripts/spdxcheck.py -v 2>&1 | grep Files
    Files checked: 64533
    Files with SPDX: 40392
    Files with errors: 0

    I think the majority of the "easy" fixups are now done, it's now the
    start of the longer-tail of crazy variants to wade through"

    * tag 'spdx-5.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (159 commits)
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 450
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 449
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 448
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 446
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 445
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 444
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 443
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 442
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 441
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 440
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 438
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 437
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 436
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 435
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 434
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 433
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 432
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 431
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 430
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 429
    ...

    Linus Torvalds
     

08 Jun, 2019

1 commit

  • get_desc() computes a pointer into the LDT while holding a lock that
    protects the LDT from being freed, but then drops the lock and returns the
    (now potentially dangling) pointer to its caller.

    Fix it by giving the caller a copy of the LDT entry instead.
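
    A hedged sketch of the fix pattern (the lock and helpers below are
    hypothetical stand-ins; the real code lives in arch/x86/lib/insn-eval.c):

    /* Copy the descriptor out while the lock protecting the LDT is held,
     * instead of returning a pointer that may dangle once the lock is
     * dropped and the LDT is freed. */
    static bool get_desc(struct desc_struct *out, unsigned short sel)
    {
            bool success = false;

            mutex_lock(&ldt_lock);                  /* hypothetical lock   */
            if (ldt_sel_in_range(sel)) {            /* hypothetical helper */
                    *out = ldt_entry(sel);          /* copy, do not alias  */
                    success = true;
            }
            mutex_unlock(&ldt_lock);

            return success;
    }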

    Fixes: 670f928ba09b ("x86/insn-eval: Add utility function to get segment descriptor")
    Cc: stable@vger.kernel.org
    Signed-off-by: Jann Horn
    Signed-off-by: Linus Torvalds

    Jann Horn
     

05 Jun, 2019

1 commit

  • Based on 1 normalized pattern(s):

    this program is free software you can redistribute it and or modify
    it under the terms of the gnu general public license as published by
    the free software foundation version 2 of the license

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-only

    has been chosen to replace the boilerplate/reference in 315 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Allison Randal
    Reviewed-by: Armijn Hemel
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190531190115.503150771@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

31 May, 2019

4 commits

  • Based on 1 normalized pattern(s):

    subject to the gnu public license v2

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-only

    has been chosen to replace the boilerplate/reference in 1 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Alexios Zavras
    Reviewed-by: Allison Randal
    Reviewed-by: Steve Winslow
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190528171440.222651153@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     
  • Based on 1 normalized pattern(s):

    subject to the gnu public license v 2

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-only

    has been chosen to replace the boilerplate/reference in 9 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Allison Randal
    Reviewed-by: Alexios Zavras
    Reviewed-by: Steve Winslow
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190528171440.130801526@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     
  • Based on 1 normalized pattern(s):

    this program is free software you can redistribute it and or modify
    it under the terms of the gnu general public license as published by
    the free software foundation either version 2 of the license or at
    your option any later version this program is distributed in the
    hope that it will be useful but without any warranty without even
    the implied warranty of merchantability or fitness for a particular
    purpose see the gnu general public license for more details you
    should have received a copy of the gnu general public license along
    with this program if not write to the free software foundation inc
    59 temple place suite 330 boston ma 02111 1307 usa

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-or-later

    has been chosen to replace the boilerplate/reference in 1334 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Allison Randal
    Reviewed-by: Richard Fontana
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190527070033.113240726@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     
  • Based on 1 normalized pattern(s):

    this program is free software you can redistribute it and or modify
    it under the terms of the gnu general public license as published by
    the free software foundation either version 2 of the license or at
    your option any later version

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-or-later

    has been chosen to replace the boilerplate/reference in 3029 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Allison Randal
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

21 May, 2019

1 commit

  • Add SPDX license identifiers to all files which:

    - Have no license information of any form

    - Have EXPORT_.*_SYMBOL_GPL inside which was used in the
    initial scan/conversion to ignore the file

    These files fall under the project license, GPL v2 only. The resulting SPDX
    license identifier is:

    GPL-2.0-only

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

07 May, 2019

4 commits

  • Pull x86 timer updates from Ingo Molnar:
    "Two changes: an LTO improvement, plus the new 'nowatchdog' boot option
    to disable the clocksource watchdog"

    * 'x86-timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86/timer: Don't inline __const_udelay()
    x86/tsc: Add option to disable tsc clocksource watchdog

    Linus Torvalds
     
  • Pull x86 asm updates from Ingo Molnar:
    "This includes the following changes:

    - cpu_has() cleanups

    - sync_bitops.h modernization to the rmwcc.h facility, similarly to
    bitops.h

    - continued LTO annotations/fixes

    - misc cleanups and smaller cleanups"

    * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86/um/vdso: Drop unnecessary cc-ldoption
    x86/vdso: Rename variable to fix -Wshadow warning
    x86/cpu/amd: Exclude 32bit only assembler from 64bit build
    x86/asm: Mark all top level asm statements as .text
    x86/build/vdso: Add FORCE to the build rule of %.so
    x86/asm: Modernize sync_bitops.h
    x86/mm: Convert some slow-path static_cpu_has() callers to boot_cpu_has()
    x86: Convert some slow-path static_cpu_has() callers to boot_cpu_has()
    x86/asm: Clarify static_cpu_has()'s intended use
    x86/uaccess: Fix implicit cast of __user pointer
    x86/cpufeature: Remove __pure attribute to _static_cpu_has()

    Linus Torvalds
     
  • Pull locking updates from Ingo Molnar:
    "Here are the locking changes in this cycle:

    - rwsem unification and simpler micro-optimizations to prepare for
    more intrusive (and more lucrative) scalability improvements in
    v5.3 (Waiman Long)

    - Lockdep irq state tracking flag usage cleanups (Frederic
    Weisbecker)

    - static key improvements (Jakub Kicinski, Peter Zijlstra)

    - misc updates, cleanups and smaller fixes"

    * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (26 commits)
    locking/lockdep: Remove unnecessary unlikely()
    locking/static_key: Don't take sleeping locks in __static_key_slow_dec_deferred()
    locking/static_key: Factor out the fast path of static_key_slow_dec()
    locking/static_key: Add support for deferred static branches
    locking/lockdep: Test all incompatible scenarios at once in check_irq_usage()
    locking/lockdep: Avoid bogus Clang warning
    locking/lockdep: Generate LOCKF_ bit composites
    locking/lockdep: Use expanded masks on find_usage_*() functions
    locking/lockdep: Map remaining magic numbers to lock usage mask names
    locking/lockdep: Move valid_state() inside CONFIG_TRACE_IRQFLAGS && CONFIG_PROVE_LOCKING
    locking/rwsem: Prevent unneeded warning during locking selftest
    locking/rwsem: Optimize rwsem structure for uncontended lock acquisition
    locking/rwsem: Enable lock event counting
    locking/lock_events: Don't show pvqspinlock events on bare metal
    locking/lock_events: Make lock_events available for all archs & other locks
    locking/qspinlock_stat: Introduce generic lockevent_*() counting APIs
    locking/rwsem: Enhance DEBUG_RWSEMS_WARN_ON() macro
    locking/rwsem: Add debug check for __down_read*()
    locking/rwsem: Micro-optimize rwsem_try_read_lock_unqueued()
    locking/rwsem: Move rwsem internal function declarations to rwsem-xadd.h
    ...

    Linus Torvalds
     
  • Pull objtool updates from Ingo Molnar:
    "This is a series from Peter Zijlstra that adds x86 build-time uaccess
    validation of SMAP to objtool, which will detect and warn about the
    following uaccess API usage bugs and weirdnesses:

    - call to %s() with UACCESS enabled
    - return with UACCESS enabled
    - return with UACCESS disabled from a UACCESS-safe function
    - recursive UACCESS enable
    - redundant UACCESS disable
    - UACCESS-safe disables UACCESS

    As it turns out not leaking uaccess permissions outside the intended
    uaccess functionality is hard when the interfaces are complex and when
    such bugs are mostly dormant.

    As a bonus we now also check the DF flag. We had at least one
    high-profile bug in that area in the early days of Linux, and the
    checking is fairly simple. The checks performed and warnings emitted
    are:

    - call to %s() with DF set
    - return with DF set
    - return with modified stack frame
    - recursive STD
    - redundant CLD

    It's all x86-only for now, but later on this can also be used for PAN
    on ARM and objtool is fairly cross-platform in principle.

    While all warnings emitted by this new checking facility that got
    reported to us were fixed, there might be GCC version dependent
    warnings that were not reported yet - which we'll address, should they
    trigger.

    The warnings are non-fatal build warnings"

    * 'core-objtool-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (27 commits)
    mm/uaccess: Use 'unsigned long' to placate UBSAN warnings on older GCC versions
    x86/uaccess: Dont leak the AC flag into __put_user() argument evaluation
    sched/x86_64: Don't save flags on context switch
    objtool: Add Direction Flag validation
    objtool: Add UACCESS validation
    objtool: Fix sibling call detection
    objtool: Rewrite alt->skip_orig
    objtool: Add --backtrace support
    objtool: Rewrite add_ignores()
    objtool: Handle function aliases
    objtool: Set insn->func for alternatives
    x86/uaccess, kcov: Disable stack protector
    x86/uaccess, ftrace: Fix ftrace_likely_update() vs. SMAP
    x86/uaccess, ubsan: Fix UBSAN vs. SMAP
    x86/uaccess, kasan: Fix KASAN vs SMAP
    x86/smap: Ditch __stringify()
    x86/uaccess: Introduce user_access_{save,restore}()
    x86/uaccess, signal: Fix AC=1 bloat
    x86/uaccess: Always inline user_access_begin()
    x86/uaccess, xen: Suppress SMAP warnings
    ...

    Linus Torvalds
     

30 Apr, 2019

1 commit

  • Enablement of AMD's Secure Memory Encryption feature is determined very
    early after start_kernel() is entered. Part of this procedure involves
    scanning the command line for the parameter 'mem_encrypt'.

    To determine intended state, the function sme_enable() uses library
    functions cmdline_find_option() and strncmp(). Their use occurs early
    enough such that it cannot be assumed that any instrumentation subsystem
    is initialized.

    For example, making calls to a KASAN-instrumented function before KASAN
    is set up will result in the use of uninitialized memory and a boot
    failure.

    When AMD's SME support is enabled, conditionally disable instrumentation
    of these dependent functions in lib/string.c and arch/x86/lib/cmdline.c.

    [ bp: Get rid of intermediary nostackp var and cleanup whitespace. ]

    Fixes: aca20d546214 ("x86/mm: Add support to make use of Secure Memory Encryption")
    Reported-by: Li RongQing
    Signed-off-by: Gary R Hook
    Signed-off-by: Borislav Petkov
    Cc: Alexander Shishkin
    Cc: Andrew Morton
    Cc: Andy Shevchenko
    Cc: Boris Brezillon
    Cc: Coly Li
    Cc: "dave.hansen@linux.intel.com"
    Cc: "H. Peter Anvin"
    Cc: Ingo Molnar
    Cc: Kees Cook
    Cc: Kent Overstreet
    Cc: "luto@kernel.org"
    Cc: Masahiro Yamada
    Cc: Matthew Wilcox
    Cc: "mingo@redhat.com"
    Cc: "peterz@infradead.org"
    Cc: Sebastian Andrzej Siewior
    Cc: Thomas Gleixner
    Cc: x86-ml
    Link: https://lkml.kernel.org/r/155657657552.7116.18363762932464011367.stgit@sosrh3.amd.com

    Gary Hook
     

19 Apr, 2019

2 commits

  • LTO will happily inline __const_udelay() everywhere it is used. Forcing it
    noinline saves ~44k of text in an LTO build.

    13999560 1740864 1499136 17239560 1070e08 vmlinux-with-udelay-inline
    13954764 1736768 1499136 17190668 1064f0c vmlinux-wo-udelay-inline

    Even without LTO this function should never be inlined.
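
    A minimal sketch of the annotation (the real function is in
    arch/x86/lib/delay.c; the body is elided here):

    #include <linux/compiler.h>

    /* noinline forces a single out-of-line copy; udelay() expands to
     * __const_udelay() all over the tree, so inlining it everywhere costs
     * far more text than the calls save. */
    noinline void __const_udelay(unsigned long xloops)
    {
            /* calibrated delay loop elided */
    }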

    Signed-off-by: Andi Kleen
    Signed-off-by: Thomas Gleixner
    Link: https://lkml.kernel.org/r/20190330004743.29541-4-andi@firstfloor.org

    Andi Kleen
     
  • With gcc, toplevel assembler statements that do not mark themselves as .text
    may end up in other sections. This causes LTO boot crashes because various
    assembler statements ended up in the middle of the initcall section. It's
    also a latent problem without LTO, although it's currently not known to
    cause any real problems.

    According to the gcc team it's expected behavior.

    Always mark all the top level assembler statements as text so that they
    switch to the right section.
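
    A hedged illustration of the fix (the symbol below is made up; the
    upstream change touched the kernel's existing top-level asm statements):

    /* A file-scope asm() is emitted into whatever section the compiler is
     * currently generating, which under LTO can be an init or data section.
     * Explicitly switching to .text (and back) pins it where it belongs. */
    asm(".pushsection .text\n"
        ".globl demo_stub\n"
        "demo_stub:\n"
        "        ret\n"
        ".popsection\n");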

    Signed-off-by: Andi Kleen
    Signed-off-by: Thomas Gleixner
    Link: https://lkml.kernel.org/r/20190330004743.29541-1-andi@firstfloor.org

    Andi Kleen
     

03 Apr, 2019

3 commits

  • As the generic rwsem-xadd code is using the appropriate acquire and
    release versions of the atomic operations, the arch specific rwsem.h
    files will not be that much faster than the generic code as long as the
    atomic functions are properly implemented. So we can remove those arch
    specific rwsem.h and stop building asm/rwsem.h to reduce maintenance
    effort.

    Currently, only x86, alpha and ia64 have implemented architecture
    specific fast paths. I don't have access to alpha and ia64 systems for
    testing, but they are legacy systems that are not likely to be updated
    to the latest kernel anyway.

    By using a rwsem microbenchmark, the total locking rates on a 4-socket
    56-core 112-thread x86-64 system before and after the patch were as
    follows (mixed means equal # of read and write locks):

                         Before Patch               After Patch
    # of Threads    wlock   rlock   mixed     wlock   rlock   mixed
    ------------    -----   -----   -----     -----   -----   -----
          1        29,201  30,143  29,458    28,615  30,172  29,201
          2         6,807  13,299   1,171     7,725  15,025   1,804
          4         6,504  12,755   1,520     7,127  14,286   1,345
          8         6,762  13,412     764     6,826  13,652     726
         16         6,693  15,408     662     6,599  15,938     626
         32         6,145  15,286     496     5,549  15,487     511
         64         5,812  15,495      60     5,858  15,572      60

    There were some run-to-run variations for the multi-thread tests. For
    x86-64, using the generic C code fast path seems to be a little bit
    faster than the assembly version with low lock contention. Looking at
    the assembly version of the fast paths, there are assembly to/from C
    code wrappers that save and restore all the callee-clobbered registers
    (7 registers on x86-64). The assembly generated from the generic C
    code doesn't need to do that. That may explain the slight performance
    gain here.

    The generic asm rwsem.h can also be merged into kernel/locking/rwsem.h
    with no code change as no other code other than those under
    kernel/locking needs to access the internal rwsem macros and functions.

    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Acked-by: Linus Torvalds
    Cc: Andrew Morton
    Cc: Arnd Bergmann
    Cc: Borislav Petkov
    Cc: Davidlohr Bueso
    Cc: H. Peter Anvin
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Tim Chen
    Cc: Will Deacon
    Cc: linux-arm-kernel@lists.infradead.org
    Cc: linux-c6x-dev@linux-c6x.org
    Cc: linux-m68k@lists.linux-m68k.org
    Cc: linux-riscv@lists.infradead.org
    Cc: linux-um@lists.infradead.org
    Cc: linux-xtensa@linux-xtensa.org
    Cc: linuxppc-dev@lists.ozlabs.org
    Cc: nios2-dev@lists.rocketboards.org
    Cc: openrisc@lists.librecores.org
    Cc: uclinux-h8-devel@lists.sourceforge.jp
    Link: https://lkml.kernel.org/r/20190322143008.21313-2-longman@redhat.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     
  • New tooling got confused about this:

    arch/x86/lib/memcpy_64.o: warning: objtool: .fixup+0x7: return with UACCESS enabled

    While the code isn't wrong, it is tedious (if at all possible) to
    figure out what function a particular chunk of .fixup belongs to.

    This then confuses the objtool uaccess validation. Instead of
    returning directly from the .fixup, jump back into the right function.

    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Borislav Petkov
    Cc: Josh Poimboeuf
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • By writing the function in asm, we avoid cross-object code flow and
    objtool no longer gets confused about a 'stray' CLAC.

    Also, the asm version is actually _simpler_.

    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Borislav Petkov
    Cc: Josh Poimboeuf
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

21 Mar, 2019

1 commit

  • The increment of buff is indented one level too deeply; clean
    this up by removing a tab.

    Signed-off-by: Colin Ian King
    Signed-off-by: Thomas Gleixner
    Cc: Borislav Petkov
    Cc: "H . Peter Anvin"
    Cc: kernel-janitors@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190314230838.18256-1-colin.king@canonical.com

    Colin Ian King
     

08 Mar, 2019

1 commit

  • Pull x86 cleanups from Ingo Molnar:
    "Various cleanups and simplifications, none of them really stands out,
    they are all over the place"

    * 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86/uaccess: Remove unused __addr_ok() macro
    x86/smpboot: Remove unused phys_id variable
    x86/mm/dump_pagetables: Remove the unused prev_pud variable
    x86/fpu: Move init_xstate_size() to __init section
    x86/cpu_entry_area: Move percpu_setup_debug_store() to __init section
    x86/mtrr: Remove unused variable
    x86/boot/compressed/64: Explain paging_prepare()'s return value
    x86/resctrl: Remove duplicate MSR_MISC_FEATURE_CONTROL definition
    x86/asm/suspend: Drop ENTRY from local data
    x86/hw_breakpoints, kprobes: Remove kprobes ifdeffery
    x86/boot: Save several bytes in decompressor
    x86/trap: Remove useless declaration
    x86/mm/tlb: Remove unused cpu variable
    x86/events: Mark expected switch-case fall-throughs
    x86/asm-prototypes: Remove duplicate include
    x86/kernel: Mark expected switch-case fall-throughs
    x86/insn-eval: Mark expected switch-case fall-through
    x86/platform/UV: Replace kmalloc() and memset() with k[cz]alloc() calls
    x86/e820: Replace kmalloc() + memcpy() with kmemdup()

    Linus Torvalds
     

06 Mar, 2019

1 commit

  • The descriptions of userspace memory access functions had minor issues
    with formatting that made kernel-doc unable to properly detect the
    function/macro names and the return value sections:

    ./arch/x86/include/asm/uaccess.h:80: info: Scanning doc for
    ./arch/x86/include/asm/uaccess.h:139: info: Scanning doc for
    ./arch/x86/include/asm/uaccess.h:231: info: Scanning doc for
    ./arch/x86/include/asm/uaccess.h:505: info: Scanning doc for
    ./arch/x86/include/asm/uaccess.h:530: info: Scanning doc for
    ./arch/x86/lib/usercopy_32.c:58: info: Scanning doc for
    ./arch/x86/lib/usercopy_32.c:69: warning: No description found for return
    value of 'clear_user'
    ./arch/x86/lib/usercopy_32.c:78: info: Scanning doc for
    ./arch/x86/lib/usercopy_32.c:90: warning: No description found for return
    value of '__clear_user'

    Fix the formatting.
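
    For illustration, a kernel-doc comment in the shape the tool expects,
    including the "Return:" section it was missing (wording approximate):

    /**
     * clear_user - Zero a block of memory in user space.
     * @to:  Destination address, in user space.
     * @n:   Number of bytes to zero.
     *
     * Zero a block of memory in user space.
     *
     * Return: number of bytes that could not be cleared.
     * On success, this will be zero.
     */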

    Link: http://lkml.kernel.org/r/1549549644-4903-3-git-send-email-rppt@linux.ibm.com
    Signed-off-by: Mike Rapoport
    Reviewed-by: Andrew Morton
    Cc: Jonathan Corbet
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Rapoport
     

02 Feb, 2019

1 commit

  • In commit 170d13ca3a2f ("x86: re-introduce non-generic memcpy_{to,from}io")
    I made our copy from IO space use a separate copy routine rather than
    rely on the generic memcpy. I did that because our generic memory copy
    isn't actually well-defined when it comes to internal access ordering or
    alignment, and will in fact depend on various CPUID flags.

    In particular, the default memcpy() for a modern Intel CPU will
    generally be just a "rep movsb", which works reasonably well for
    medium-sized memory copies of regular RAM, since the CPU will turn it
    into fairly optimized microcode.

    However, for non-cached memory and IO, "rep movs" ends up being
    horrendously slow and will just do the architectural "one byte at a
    time" accesses implied by the movsb.

    At the other end of the spectrum, if you _don't_ end up using the "rep
    movsb" code, you'd likely fall back to the software copy, which does
    overlapping accesses for the tail, and may copy things backwards.
    Again, for regular memory that's fine, for IO memory not so much.

    The thinking was that clearly nobody really cared (because things
    worked), but some people had seen horrible performance due to the byte
    accesses, so let's just revert to our long-ago version that did
    "rep movsl" for the bulk of the copy and then fixed up the potential
    last few bytes of the tail with "movsw/b".

    Interestingly (and perhaps not entirely surprisingly), while that was
    our original memory copy implementation, and had been used before for
    IO, in the meantime many new users of memcpy_*io() had come about. And
    while the access patterns for the memory copy weren't well-defined (so
    arguably _any_ access pattern should work), in practice the "rep movsb"
    case had been very common for the last several years.

    In particular Jarkko Sakkinen reported that the memcpy_*io() change
    resulted in weird errors from his Geminilake NUC TPM module.

    And it turns out that the TPM TCG accesses according to spec require
    that the accesses be

    (a) done strictly sequentially

    (b) naturally aligned

    otherwise the TPM chip will abort the PCI transaction.

    And, in fact, the tpm_crb.c driver did this:

    memcpy_fromio(buf, priv->rsp, 6);
    ...
    memcpy_fromio(&buf[6], &priv->rsp[6], expected - 6);

    which really should never have worked in the first place, but back
    before commit 170d13ca3a2f it *happened* to work, because the
    memcpy_fromio() would be expanded to a regular memcpy, and

    (a) gcc would expand the first memcpy in-line, and turn it into a
    4-byte and a 2-byte read, and they happened to be in the right
    order, and the alignment was right.

    (b) gcc would call "memcpy()" for the second one, and the machines that
    had this TPM chip also apparently ended up always having ERMS
    ("Enhanced REP MOVSB/STOSB instructions"), so we'd use the "rep
    movbs" for that copy.

    In other words, basically by pure luck, the code happened to use the
    right access sizes in the (two different!) memcpy() implementations to
    make it all work.

    But after commit 170d13ca3a2f, both of the memcpy_fromio() calls
    resulted in a call to the routine with the consistent memory accesses,
    and in both cases it started out transferring with 4-byte accesses.
    Which worked for the first copy, but resulted in the second copy doing a
    32-bit read at an address that was only 2-byte aligned.

    Jarkko is actually fixing the fragile code in the TPM driver, but since
    this is an excellent example of why we absolutely must not use a generic
    memcpy for IO accesses, _and_ an IO-specific one really should strive to
    align the IO accesses, let's do exactly that.
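
    A hedged sketch of the resulting access pattern (simplified; not the
    exact upstream implementation):

    #include <linux/io.h>
    #include <linux/types.h>

    static void memcpy_fromio_aligned(void *to,
                                      const volatile void __iomem *from,
                                      size_t n)
    {
            /* peel off leading bytes until the IO address is 4-byte aligned */
            while (n && ((unsigned long)from & 3)) {
                    *(u8 *)to = readb(from);
                    to++; from++; n--;
            }
            /* bulk of the copy with naturally aligned 32-bit reads */
            while (n >= 4) {
                    *(u32 *)to = readl(from);
                    to += 4; from += 4; n -= 4;
            }
            /* byte-wise tail */
            while (n) {
                    *(u8 *)to = readb(from);
                    to++; from++; n--;
            }
    }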

    Side note: Jarkko also noted that the driver had been used on ARM
    platforms, and had worked. That was because on 32-bit ARM, memcpy_*io()
    ends up always doing byte accesses, and on 64-bit ARM it first does byte
    accesses to align to 8-byte boundaries, and then does 8-byte accesses
    for the bulk.

    So ARM actually worked by design, and the x86 case worked by pure luck.

    We *might* want to make x86-64 do the 8-byte case too. That should be a
    pretty straightforward extension, but let's do one thing at a time. And
    generally MMIO accesses aren't really all that performance-critical, as
    shown by the fact that for a long time we just did them a byte at a
    time, and very few people ever noticed.

    Reported-and-tested-by: Jarkko Sakkinen
    Tested-by: Jerry Snitselaar
    Cc: David Laight
    Fixes: 170d13ca3a2f ("x86: re-introduce non-generic memcpy_{to,from}io")
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

26 Jan, 2019

1 commit

  • In preparation to enable -Wimplicit-fallthrough by default, mark
    switch-case statements where fall-through is intentional, explicitly.

    Thus fix the following warning:

    arch/x86/lib/insn-eval.c: In function ‘resolve_default_seg’:
    arch/x86/lib/insn-eval.c:179:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
    if (insn->addr_bytes == 2)
    ^
    arch/x86/lib/insn-eval.c:182:2: note: here
    case -EDOM:
    ^~~~

    Warning level 3 was used: -Wimplicit-fallthrough=3

    This is part of the ongoing efforts to enable -Wimplicit-fallthrough by
    default.
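
    A self-contained illustration of the annotation style used at the time
    (function and values are made up; the marker was a comment, predating
    the later fallthrough pseudo-keyword):

    #include <linux/errno.h>

    static int resolve_seg_demo(int addr_bytes, int regno)
    {
            switch (regno) {
            case 0:
                    if (addr_bytes == 2)
                            return -EINVAL;
                    /* fall through */
            case 1:
                    return 0;
            default:
                    return -EDOM;
            }
    }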

    [ bp: Massage commit message. ]

    Signed-off-by: Gustavo A. R. Silva
    Signed-off-by: Borislav Petkov
    Cc: "H. Peter Anvin"
    Cc: Ingo Molnar
    Cc: Kees Cook
    Cc: Thomas Gleixner
    Cc: x86-ml
    Link: https://lkml.kernel.org/r/20190125205520.GA9602@embeddedor

    Gustavo A. R. Silva
     

12 Jan, 2019

1 commit

  • The outb() function takes parameters value and port, in that order. Fix
    the parameters used in the KASLR i8254 fallback code.
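
    A hedged illustration of the calling convention (the constants are
    made-up stand-ins for the i8254 defines):

    #include <asm/io.h>

    #define DEMO_PORT_CONTROL       0x43    /* i8254 mode/command port */
    #define DEMO_CMD_READBACK       0xc0

    static void program_i8254_demo(void)
    {
            /* x86 outb() is outb(value, port); swapping the arguments
             * writes the wrong value to the wrong port and the compiler
             * cannot warn, since both arguments are plain integers. */
            outb(DEMO_CMD_READBACK, DEMO_PORT_CONTROL);
    }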

    Fixes: 5bfce5ef55cb ("x86, kaslr: Provide randomness functions")
    Signed-off-by: Daniel Drake
    Signed-off-by: Thomas Gleixner
    Cc: bp@alien8.de
    Cc: hpa@zytor.com
    Cc: linux@endlessm.com
    Cc: stable@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190107034024.15005-1-drake@endlessm.com

    Daniel Drake
     

05 Jan, 2019

1 commit

  • This has been broken forever, and nobody ever really noticed because
    it's purely a performance issue.

    Long long ago, in commit 6175ddf06b61 ("x86: Clean up mem*io functions")
    Brian Gerst simplified the memory copies to and from iomem, since on
    x86, the instructions to access iomem are exactly the same as the
    regular instructions.

    That is technically true, and things worked, and nobody said anything.
    Besides, back then the regular memcpy was pretty simple and worked fine.

    Nobody noticed except for David Laight, that is. David has been testing a
    TLP monitor he was writing for an FPGA, and has been occasionally
    complaining about how memcpy_toio() writes things one byte at a time.

    Which is completely unacceptable from a performance standpoint, even if
    it happens to technically work.

    The reason it's writing one byte at a time is because while it's
    technically true that accesses to iomem are the same as accesses to
    regular memory on x86, the _granularity_ (and ordering) of accesses
    matter to iomem in ways that they don't matter to regular cached memory.

    In particular, when ERMS is set, we default to using "rep movsb" for
    larger memory copies. That is indeed perfectly fine for real memory,
    since the whole point is that the CPU is going to do cacheline
    optimizations and executes the memory copy efficiently for cached
    memory.

    With iomem? Not so much. With iomem, "rep movsb" will indeed work, but
    it will copy things one byte at a time. Slowly and ponderously.

    Now, originally, back in 2010 when commit 6175ddf06b61 was done, we
    didn't use ERMS, and this was much less noticeable.

    Our normal memcpy() was simpler in other ways too.

    Because in fact, it's not just about using the string instructions. Our
    memcpy() these days does things like "read and write overlapping values"
    to handle the last bytes of the copy. Again, for normal memory,
    overlapping accesses isn't an issue. For iomem? It can be.

    So this re-introduces the specialized memcpy_toio(), memcpy_fromio() and
    memset_io() functions. It doesn't particularly optimize them, but it
    tries to at least not be horrid, or do overlapping accesses. In fact,
    this uses the existing __inline_memcpy() function that we still had
    lying around that uses our very traditional "rep movsl" loop followed by
    movsw/movsb for the final bytes.

    Somebody may decide to try to improve on it, but if we've gone almost a
    decade with only one person really ever noticing and complaining, maybe
    it's not worth worrying about further, once it's not _completely_ broken?

    Reported-by: David Laight
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

04 Jan, 2019

1 commit

  • Nobody has actually used the type (VERIFY_READ vs VERIFY_WRITE) argument
    of the user address range verification function since we got rid of the
    old racy i386-only code to walk page tables by hand.

    It existed because the original 80386 would not honor the write protect
    bit when in kernel mode, so you had to do COW by hand before doing any
    user access. But we haven't supported that in a long time, and these
    days the 'type' argument is a purely historical artifact.

    A discussion about extending 'user_access_begin()' to do the range
    checking resulted in this patch, because there is no way we're going to
    move the old VERIFY_xyz interface to that model. And it's best done at
    the end of the merge window when I've done most of my merges, so let's
    just get this done once and for all.

    This patch was mostly done with a sed-script, with manual fix-ups for
    the cases that weren't of the trivial 'access_ok(VERIFY_xyz' form.
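
    For the trivial form, the conversion looks roughly like this
    (illustrative caller, not from a specific file):

    /* before */
    if (!access_ok(VERIFY_WRITE, buf, count))
            return -EFAULT;

    /* after: the type argument is gone, the range check is unchanged */
    if (!access_ok(buf, count))
            return -EFAULT;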

    There were a couple of notable cases:

    - csky still had the old "verify_area()" name as an alias.

    - the iter_iov code had magical hardcoded knowledge of the actual
    values of VERIFY_{READ,WRITE} (not that they mattered, since nothing
    really used it)

    - microblaze used the type argument for a debug printout

    but other than those oddities this should be a total no-op patch.

    I tried to fix up all architectures, did fairly extensive grepping for
    access_ok() uses, and the changes are trivial, but I may have missed
    something. Any missed conversion should be trivially fixable, though.

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

23 Oct, 2018

1 commit

  • Pull x86 asm updates from Ingo Molnar:
    "The main changes in this cycle were the fsgsbase related preparatory
    patches from Chang S. Bae - but there's also an optimized
    memcpy_flushcache() and a cleanup for the __cmpxchg_double() assembly
    glue"

    * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86/fsgsbase/64: Clean up various details
    x86/segments: Introduce the 'CPUNODE' naming to better document the segment limit CPU/node NR trick
    x86/vdso: Initialize the CPU/node NR segment descriptor earlier
    x86/vdso: Introduce helper functions for CPU and node number
    x86/segments/64: Rename the GDT PER_CPU entry to CPU_NUMBER
    x86/fsgsbase/64: Factor out FS/GS segment loading from __switch_to()
    x86/fsgsbase/64: Convert the ELF core dump code to the new FSGSBASE helpers
    x86/fsgsbase/64: Make ptrace use the new FS/GS base helpers
    x86/fsgsbase/64: Introduce FS/GS base helper functions
    x86/fsgsbase/64: Fix ptrace() to read the FS/GS base accurately
    x86/asm: Use CC_SET()/CC_OUT() in __cmpxchg_double()
    x86/asm: Optimize memcpy_flushcache()

    Linus Torvalds
     

10 Sep, 2018

1 commit

  • I use memcpy_flushcache() in my persistent memory driver for metadata
    updates; there are many 8-byte and 16-byte updates, and it turns out that
    the overhead of memcpy_flushcache causes 2% performance degradation
    compared to "movnti" instruction explicitly coded using inline assembler.

    The tests were done on a Skylake processor with persistent memory emulated
    using the "memmap" kernel parameter. dd was used to copy data to the
    dm-writecache target.

    This patch recognizes memcpy_flushcache calls with constant short length
    and turns them into inline assembler - so that I don't have to use inline
    assembler in the driver.
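
    A hedged sketch of the shape of the optimization, reduced to the 8-byte
    case (the real inline handles more sizes; the out-of-line helper name
    is assumed):

    #include <linux/compiler.h>
    #include <linux/types.h>

    static __always_inline void
    memcpy_flushcache_demo(void *dst, const void *src, size_t cnt)
    {
            if (__builtin_constant_p(cnt) && cnt == 8) {
                    /* one non-temporal store, no function call */
                    asm ("movnti %1, %0"
                         : "=m" (*(u64 *)dst)
                         : "r" (*(const u64 *)src));
                    return;
            }
            __memcpy_flushcache(dst, src, cnt);     /* out-of-line path */
    }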

    Signed-off-by: Mikulas Patocka
    Cc: Dan Williams
    Cc: Linus Torvalds
    Cc: Mike Snitzer
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: device-mapper development
    Link: http://lkml.kernel.org/r/alpine.LRH.2.02.1808081720460.24747@file01.intranet.prod.int.rdu2.redhat.com
    Signed-off-by: Ingo Molnar

    Mikulas Patocka
     

03 Sep, 2018

1 commit

  • Currently, most fixups for attempting to access userspace memory are
    handled using _ASM_EXTABLE, which is also used for various other types of
    fixups (e.g. safe MSR access, IRET failures, and a bunch of other things).
    In order to make it possible to add special safety checks to uaccess fixups
    (in particular, checking whether the fault address is actually in
    userspace), introduce a new exception table handler ex_handler_uaccess()
    and wire it up to all the user access fixups (excluding ones that
    already use _ASM_EXTABLE_EX).
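
    A hedged sketch of the shape of such a handler (signature and helpers
    abridged from the x86 extable code of that era):

    __visible bool ex_handler_uaccess(const struct exception_table_entry *fixup,
                                      struct pt_regs *regs, int trapnr)
    {
            /* the place for uaccess-only sanity checks, e.g. warning when
             * the faulting access was not actually to a user address */
            regs->ip = ex_fixup_addr(fixup);        /* resume at the fixup */
            return true;
    }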

    Signed-off-by: Jann Horn
    Signed-off-by: Thomas Gleixner
    Tested-by: Kees Cook
    Cc: Andy Lutomirski
    Cc: kernel-hardening@lists.openwall.com
    Cc: dvyukov@google.com
    Cc: Masami Hiramatsu
    Cc: "Naveen N. Rao"
    Cc: Anil S Keshavamurthy
    Cc: "David S. Miller"
    Cc: Alexander Viro
    Cc: linux-fsdevel@vger.kernel.org
    Cc: Borislav Petkov
    Link: https://lkml.kernel.org/r/20180828201421.157735-5-jannh@google.com

    Jann Horn
     

31 Aug, 2018

1 commit

  • An NMI can hit in the middle of context switching or in the middle of
    switch_mm_irqs_off(). In either case, CR3 might not match current->mm,
    which could cause copy_from_user_nmi() and friends to read the wrong
    memory.

    Fix it by adding a new nmi_uaccess_okay() helper and checking it in
    copy_from_user_nmi() and in __copy_from_user_nmi()'s callers.
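
    A hedged sketch of the check (simplified from the real helper in the
    x86 mm/tlb code):

    #include <linux/sched.h>
    #include <asm/tlbflush.h>

    static inline bool nmi_uaccess_okay_demo(void)
    {
            /* An NMI can land mid context switch, so the live CR3 (tracked
             * via cpu_tlbstate.loaded_mm) may not belong to current->mm;
             * user copies would then read through the wrong page tables. */
            return current->mm &&
                   this_cpu_read(cpu_tlbstate.loaded_mm) == current->mm;
    }

    copy_from_user_nmi() and the other callers bail out early when the
    check fails instead of attempting the copy.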

    Signed-off-by: Andy Lutomirski
    Signed-off-by: Thomas Gleixner
    Reviewed-by: Rik van Riel
    Cc: Nadav Amit
    Cc: Borislav Petkov
    Cc: Jann Horn
    Cc: Peter Zijlstra
    Cc: stable@vger.kernel.org
    Link: https://lkml.kernel.org/r/dd956eba16646fd0b15c3c0741269dfd84452dac.1535557289.git.luto@kernel.org

    Andy Lutomirski
     

03 Jul, 2018

1 commit

  • Some Intel CPUs don't recognize 64-bit XORs as zeroing idioms. Zeroing
    idioms don't require execution bandwidth, as they're being taken care
    of in the frontend (through register renaming). Use 32-bit XORs instead.
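
    A hedged, standalone illustration of the difference (demo function, not
    taken from the patch itself):

    static unsigned long zero_reg_demo(void)
    {
            unsigned long ret;

            /* %k0 prints the 32-bit register name: writing %eax
             * zero-extends into %rax, and the 32-bit xor is recognized as
             * a zeroing idiom, handled at rename with no execution
             * bandwidth; the 64-bit form is not on some Intel CPUs. */
            asm ("xorl %k0, %k0" : "=r" (ret));
            return ret;
    }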

    Signed-off-by: Jan Beulich
    Cc: Alok Kataria
    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Denys Vlasenko
    Cc: H. Peter Anvin
    Cc: Josh Poimboeuf
    Cc: Juergen Gross
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: davem@davemloft.net
    Cc: herbert@gondor.apana.org.au
    Cc: pavel@ucw.cz
    Cc: rjw@rjwysocki.net
    Link: http://lkml.kernel.org/r/5B39FF1A02000078001CFB54@prv1-mh.provo.novell.com
    Signed-off-by: Ingo Molnar

    Jan Beulich