25 Jan, 2021
1 commit
-
Change-Id: I173bc0325c93daaf82bb894f77ba35fa827864de
17 Jan, 2021
1 commit
-
commit 2ca408d9c749c32288bc28725f9f12ba30299e8f upstream.
Commit
121b32a58a3a ("x86/entry/32: Use IA32-specific wrappers for syscalls taking 64-bit arguments")
converted native x86-32 which take 64-bit arguments to use the
compat handlers to allow conversion to passing args via pt_regs.
sys_fanotify_mark() was however missed, as it has a general compat
handler. Add a config option that will use the syscall wrapper that
takes the split args for native 32-bit.[ bp: Fix typo in Kconfig help text. ]
Fixes: 121b32a58a3a ("x86/entry/32: Use IA32-specific wrappers for syscalls taking 64-bit arguments")
Reported-by: Paweł Jasiak
Signed-off-by: Brian Gerst
Signed-off-by: Borislav Petkov
Acked-by: Jan Kara
Acked-by: Andy Lutomirski
Link: https://lkml.kernel.org/r/20201130223059.101286-1-brgerst@gmail.com
Signed-off-by: Greg Kroah-Hartman
17 Dec, 2020
2 commits
-
Pass code model and stack alignment to the linker as these are not
stored in LLVM bitcode, and allow CONFIG_LTO_CLANG* to be enabled.Bug: 145210207
Change-Id: I50055b0be4fdf9f93b770d8b651f8c54dbda584e
Link: https://lore.kernel.org/lkml/20201013003203.4168817-26-samitolvanen@google.com/
Signed-off-by: Sami Tolvanen
Reviewed-by: Kees Cook -
Select HAVE_OBJTOOL_MCOUNT if STACK_VALIDATION is selected to use
objtool to generate __mcount_loc sections for dynamic ftrace with
Clang and gcc
Reviewed-by: Kees Cook
01 Dec, 2020
1 commit
-
Currently, '--orphan-handling=warn' is spread out across four different
architectures in their respective Makefiles, which makes it a little
unruly to deal with in case it needs to be disabled for a specific
linker version (in this case, ld.lld 10.0.1).To make it easier to control this, hoist this warning into Kconfig and
the main Makefile so that disabling it is simpler, as the warning will
only be enabled in a couple places (main Makefile and a couple of
compressed boot folders that blow away LDFLAGS_vmlinx) and making it
conditional is easier due to Kconfig syntax. One small additional
benefit of this is saving a call to ld-option on incremental builds
because we will have already evaluated it for CONFIG_LD_ORPHAN_WARN.To keep the list of supported architectures the same, introduce
CONFIG_ARCH_WANT_LD_ORPHAN_WARN, which an architecture can select to
gain this automatically after all of the sections are specified and size
asserted. A special thanks to Kees Cook for the help text on this
config.Link: https://github.com/ClangBuiltLinux/linux/issues/1187
Acked-by: Kees Cook
Acked-by: Michael Ellerman (powerpc)
Reviewed-by: Nick Desaulniers
Tested-by: Nick Desaulniers
Signed-off-by: Nathan Chancellor
Signed-off-by: Masahiro Yamada
15 Oct, 2020
1 commit
-
Pull x86 SEV-ES support from Borislav Petkov:
"SEV-ES enhances the current guest memory encryption support called SEV
by also encrypting the guest register state, making the registers
inaccessible to the hypervisor by en-/decrypting them on world
switches. Thus, it adds additional protection to Linux guests against
exfiltration, control flow and rollback attacks.With SEV-ES, the guest is in full control of what registers the
hypervisor can access. This is provided by a guest-host exchange
mechanism based on a new exception vector called VMM Communication
Exception (#VC), a new instruction called VMGEXIT and a shared
Guest-Host Communication Block which is a decrypted page shared
between the guest and the hypervisor.Intercepts to the hypervisor become #VC exceptions in an SEV-ES guest
so in order for that exception mechanism to work, the early x86 init
code needed to be made able to handle exceptions, which, in itself,
brings a bunch of very nice cleanups and improvements to the early
boot code like an early page fault handler, allowing for on-demand
building of the identity mapping. With that, !KASLR configurations do
not use the EFI page table anymore but switch to a kernel-controlled
one.The main part of this series adds the support for that new exchange
mechanism. The goal has been to keep this as much as possibly separate
from the core x86 code by concentrating the machinery in two
SEV-ES-specific files:arch/x86/kernel/sev-es-shared.c
arch/x86/kernel/sev-es.cOther interaction with core x86 code has been kept at minimum and
behind static keys to minimize the performance impact on !SEV-ES
setups.Work by Joerg Roedel and Thomas Lendacky and others"
* tag 'x86_seves_for_v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (73 commits)
x86/sev-es: Use GHCB accessor for setting the MMIO scratch buffer
x86/sev-es: Check required CPU features for SEV-ES
x86/efi: Add GHCB mappings when SEV-ES is active
x86/sev-es: Handle NMI State
x86/sev-es: Support CPU offline/online
x86/head/64: Don't call verify_cpu() on starting APs
x86/smpboot: Load TSS and getcpu GDT entry before loading IDT
x86/realmode: Setup AP jump table
x86/realmode: Add SEV-ES specific trampoline entry point
x86/vmware: Add VMware-specific handling for VMMCALL under SEV-ES
x86/kvm: Add KVM-specific VMMCALL handling under SEV-ES
x86/paravirt: Allow hypervisor-specific VMMCALL handling under SEV-ES
x86/sev-es: Handle #DB Events
x86/sev-es: Handle #AC Events
x86/sev-es: Handle VMMCALL Events
x86/sev-es: Handle MWAIT/MWAITX Events
x86/sev-es: Handle MONITOR/MONITORX Events
x86/sev-es: Handle INVD Events
x86/sev-es: Handle RDPMC Events
x86/sev-es: Handle RDTSC(P) Events
...
14 Oct, 2020
1 commit
-
Pull seccomp updates from Kees Cook:
"The bulk of the changes are with the seccomp selftests to accommodate
some powerpc-specific behavioral characteristics. Additional cleanups,
fixes, and improvements are also included:- heavily refactor seccomp selftests (and clone3 selftests
dependency) to fix powerpc (Kees Cook, Thadeu Lima de Souza
Cascardo)- fix style issue in selftests (Zou Wei)
- upgrade "unknown action" from KILL_THREAD to KILL_PROCESS (Rich
Felker)- replace task_pt_regs(current) with current_pt_regs() (Denis
Efremov)- fix corner-case race in USER_NOTIF (Jann Horn)
- make CONFIG_SECCOMP no longer per-arch (YiFei Zhu)"
* tag 'seccomp-v5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (23 commits)
seccomp: Make duplicate listener detection non-racy
seccomp: Move config option SECCOMP to arch/Kconfig
selftests/clone3: Avoid OS-defined clone_args
selftests/seccomp: powerpc: Set syscall return during ptrace syscall exit
selftests/seccomp: Allow syscall nr and ret value to be set separately
selftests/seccomp: Record syscall during ptrace entry
selftests/seccomp: powerpc: Fix seccomp return value testing
selftests/seccomp: Remove SYSCALL_NUM_RET_SHARE_REG in favor of SYSCALL_RET_SET
selftests/seccomp: Avoid redundant register flushes
selftests/seccomp: Convert REGSET calls into ARCH_GETREG/ARCH_SETREG
selftests/seccomp: Convert HAVE_GETREG into ARCH_GETREG/ARCH_SETREG
selftests/seccomp: Remove syscall setting #ifdefs
selftests/seccomp: mips: Remove O32-specific macro
selftests/seccomp: arm64: Define SYSCALL_NUM_SET macro
selftests/seccomp: arm: Define SYSCALL_NUM_SET macro
selftests/seccomp: mips: Define SYSCALL_NUM_SET macro
selftests/seccomp: Provide generic syscall setting macro
selftests/seccomp: Refactor arch register macros to avoid xtensa special case
selftests/seccomp: Use __NR_mknodat instead of __NR_mknod
selftests/seccomp: Use bitwise instead of arithmetic operator for flags
...
13 Oct, 2020
1 commit
-
Pull static call support from Ingo Molnar:
"This introduces static_call(), which is the idea of static_branch()
applied to indirect function calls. Remove a data load (indirection)
by modifying the text.They give the flexibility of function pointers, but with better
performance. (This is especially important for cases where retpolines
would otherwise be used, as retpolines can be pretty slow.)API overview:
DECLARE_STATIC_CALL(name, func);
DEFINE_STATIC_CALL(name, func);
DEFINE_STATIC_CALL_NULL(name, typename);static_call(name)(args...);
static_call_cond(name)(args...);
static_call_update(name, func);x86 is supported via text patching, otherwise basic indirect calls are
used, with function pointers.There's a second variant using inline code patching, inspired by
jump-labels, implemented on x86 as well.The new APIs are utilized in the x86 perf code, a heavy user of
function pointers, where static calls speed up the PMU handler by
4.2% (!).The generic implementation is not really excercised on other
architectures, outside of the trivial test_static_call_init()
self-test"* tag 'core-static_call-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits)
static_call: Fix return type of static_call_init
tracepoint: Fix out of sync data passing by static caller
tracepoint: Fix overly long tracepoint names
x86/perf, static_call: Optimize x86_pmu methods
tracepoint: Optimize using static_call()
static_call: Allow early init
static_call: Add some validation
static_call: Handle tail-calls
static_call: Add static_call_cond()
x86/alternatives: Teach text_poke_bp() to emulate RET
static_call: Add simple self-test for static calls
x86/static_call: Add inline static call implementation for x86-64
x86/static_call: Add out-of-line static call implementation
static_call: Avoid kprobes on inline static_call()s
static_call: Add inline static call infrastructure
static_call: Add basic static call infrastructure
compiler.h: Make __ADDRESSABLE() symbol truly unique
jump_label,module: Fix module lifetime for __jump_label_mod_text_reserved()
module: Properly propagate MODULE_STATE_COMING failure
module: Fix up module_notifier return values
...
09 Oct, 2020
1 commit
-
In order to make adding configurable features into seccomp easier,
it's better to have the options at one single location, considering
especially that the bulk of seccomp code is arch-independent. An quick
look also show that many SECCOMP descriptions are outdated; they talk
about /proc rather than prctl.As a result of moving the config option and keeping it default on,
architectures arm, arm64, csky, riscv, sh, and xtensa did not have SECCOMP
on by default prior to this and SECCOMP will be default in this change.Architectures microblaze, mips, powerpc, s390, sh, and sparc have an
outdated depend on PROC_FS and this dependency is removed in this change.Suggested-by: Jann Horn
Link: https://lore.kernel.org/lkml/CAG48ez1YWz9cnp08UZgeieYRhHdqh-ch7aNwc4JRBnGyrmgfMg@mail.gmail.com/
Signed-off-by: YiFei Zhu
[kees: added HAVE_ARCH_SECCOMP help text, tweaked wording]
Signed-off-by: Kees Cook
Link: https://lore.kernel.org/r/9ede6ef35c847e58d61e476c6a39540520066613.1600951211.git.yifeifz2@illinois.edu
06 Oct, 2020
1 commit
-
In reaction to a proposal to introduce a memcpy_mcsafe_fast()
implementation Linus points out that memcpy_mcsafe() is poorly named
relative to communicating the scope of the interface. Specifically what
addresses are valid to pass as source, destination, and what faults /
exceptions are handled.Of particular concern is that even though x86 might be able to handle
the semantics of copy_mc_to_user() with its common copy_user_generic()
implementation other archs likely need / want an explicit path for this
case:On Fri, May 1, 2020 at 11:28 AM Linus Torvalds wrote:
>
> On Thu, Apr 30, 2020 at 6:21 PM Dan Williams wrote:
> >
> > However now I see that copy_user_generic() works for the wrong reason.
> > It works because the exception on the source address due to poison
> > looks no different than a write fault on the user address to the
> > caller, it's still just a short copy. So it makes copy_to_user() work
> > for the wrong reason relative to the name.
>
> Right.
>
> And it won't work that way on other architectures. On x86, we have a
> generic function that can take faults on either side, and we use it
> for both cases (and for the "in_user" case too), but that's an
> artifact of the architecture oddity.
>
> In fact, it's probably wrong even on x86 - because it can hide bugs -
> but writing those things is painful enough that everybody prefers
> having just one function.Replace a single top-level memcpy_mcsafe() with either
copy_mc_to_user(), or copy_mc_to_kernel().Introduce an x86 copy_mc_fragile() name as the rename for the
low-level x86 implementation formerly named memcpy_mcsafe(). It is used
as the slow / careful backend that is supplanted by a fast
copy_mc_generic() in a follow-on patch.One side-effect of this reorganization is that separating copy_mc_64.S
to its own file means that perf no longer needs to track dependencies
for its memcpy_64.S benchmarks.[ bp: Massage a bit. ]
Signed-off-by: Dan Williams
Signed-off-by: Borislav Petkov
Reviewed-by: Tony Luck
Acked-by: Michael Ellerman
Cc:
Link: http://lore.kernel.org/r/CAHk-=wjSqtXAqfUJxFtWNwmguFASTgB0dz1dT3V-78Quiezqbg@mail.gmail.com
Link: https://lkml.kernel.org/r/160195561680.2163339.11574962055305783722.stgit@dwillia2-desk3.amr.corp.intel.com
08 Sep, 2020
1 commit
-
Install an exception handler for #VC exception that uses a GHCB. Also
add the infrastructure for handling different exit-codes by decoding
the instruction that caused the exception and error handling.Signed-off-by: Joerg Roedel
Signed-off-by: Borislav Petkov
Link: https://lkml.kernel.org/r/20200907131613.12703-24-joro@8bytes.org
01 Sep, 2020
2 commits
-
Add the inline static call implementation for x86-64. The generated code
is identical to the out-of-line case, except we move the trampoline into
it's own section.Objtool uses the trampoline naming convention to detect all the call
sites. It then annotates those call sites in the .static_call_sites
section.During boot (and module init), the call sites are patched to call
directly into the destination function. The temporary trampoline is
then no longer used.[peterz: merged trampolines, put trampoline in section]
Signed-off-by: Josh Poimboeuf
Signed-off-by: Peter Zijlstra (Intel)
Signed-off-by: Ingo Molnar
Cc: Linus Torvalds
Link: https://lore.kernel.org/r/20200818135804.864271425@infradead.org -
Add the x86 out-of-line static call implementation. For each key, a
permanent trampoline is created which is the destination for all static
calls for the given key. The trampoline has a direct jump which gets
patched by static_call_update() when the destination function changes.[peterz: fixed trampoline, rewrote patching code]
Signed-off-by: Josh Poimboeuf
Signed-off-by: Peter Zijlstra (Intel)
Signed-off-by: Ingo Molnar
Cc: Linus Torvalds
Link: https://lore.kernel.org/r/20200818135804.804315175@infradead.org
15 Aug, 2020
1 commit
-
Pull more timer updates from Thomas Gleixner:
"A set of posix CPU timer changes which allows to defer the heavy work
of posix CPU timers into task work context. The tick interrupt is
reduced to a quick check which queues the work which is doing the
heavy lifting before returning to user space or going back to guest
mode. Moving this out is deferring the signal delivery slightly but
posix CPU timers are inaccurate by nature as they depend on the tick
so there is no real damage. The relevant test cases all passed.This lifts the last offender for RT out of the hard interrupt context
tick handler, but it also has the general benefit that the actual
heavy work is accounted to the task/process and not to the tick
interrupt itself.Further optimizations are possible to break long sighand lock hold and
interrupt disabled (on !RT kernels) times when a massive amount of
posix CPU timers (which are unpriviledged) is armed for a
task/process.This is currently only enabled for x86 because the architecture has to
ensure that task work is handled in KVM before entering a guest, which
was just established for x86 with the new common entry/exit code which
got merged post 5.8 and is not the case for other KVM architectures"* tag 'timers-core-2020-08-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86: Select POSIX_CPU_TIMERS_TASK_WORK
posix-cpu-timers: Provide mechanisms to defer timer handling to task_work
posix-cpu-timers: Split run_posix_cpu_timers()
07 Aug, 2020
1 commit
-
Pull KVM updates from Paolo Bonzini:
"s390:
- implement diag318x86:
- Report last CPU for debugging
- Emulate smaller MAXPHYADDR in the guest than in the host
- .noinstr and tracing fixes from Thomas
- nested SVM page table switching optimization and fixesGeneric:
- Unify shadow MMU cache data structures across architectures"* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (127 commits)
KVM: SVM: Fix sev_pin_memory() error handling
KVM: LAPIC: Set the TDCR settable bits
KVM: x86: Specify max TDP level via kvm_configure_mmu()
KVM: x86/mmu: Rename max_page_level to max_huge_page_level
KVM: x86: Dynamically calculate TDP level from max level and MAXPHYADDR
KVM: VXM: Remove temporary WARN on expected vs. actual EPTP level mismatch
KVM: x86: Pull the PGD's level from the MMU instead of recalculating it
KVM: VMX: Make vmx_load_mmu_pgd() static
KVM: x86/mmu: Add separate helper for shadow NPT root page role calc
KVM: VMX: Drop a duplicate declaration of construct_eptp()
KVM: nSVM: Correctly set the shadow NPT root level in its MMU role
KVM: Using macros instead of magic values
MIPS: KVM: Fix build error caused by 'kvm_run' cleanup
KVM: nSVM: remove nonsensical EXITINFO1 adjustment on nested NPF
KVM: x86: Add a capability for GUEST_MAXPHYADDR < HOST_MAXPHYADDR support
KVM: VMX: optimize #PF injection when MAXPHYADDR does not match
KVM: VMX: Add guest physical address check in EPT violation and misconfig
KVM: VMX: introduce vmx_need_pf_intercept
KVM: x86: update exception bitmap on CPUID changes
KVM: x86: rename update_bp_intercept to update_exception_bitmap
...
06 Aug, 2020
1 commit
-
Move POSIX CPU timer expiry and signal delivery into task context.
Signed-off-by: Thomas Gleixner
Signed-off-by: Ingo Molnar
Reviewed-by: Oleg Nesterov
Acked-by: Peter Zijlstra (Intel)
Link: https://lore.kernel.org/r/20200730102337.888613724@linutronix.de
05 Aug, 2020
3 commits
-
Pull x86 conversion to generic entry code from Thomas Gleixner:
"The conversion of X86 syscall, interrupt and exception entry/exit
handling to the generic code.Pretty much a straight-forward 1:1 conversion plus the consolidation
of the KVM handling of pending work before entering guest mode"* tag 'x86-entry-2020-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/kvm: Use __xfer_to_guest_mode_work_pending() in kvm_run_vcpu()
x86/kvm: Use generic xfer to guest work function
x86/entry: Cleanup idtentry_enter/exit
x86/entry: Use generic interrupt entry/exit code
x86/entry: Cleanup idtentry_entry/exit_user
x86/entry: Use generic syscall exit functionality
x86/entry: Use generic syscall entry function
x86/ptrace: Provide pt_regs helper for entry/exit
x86/entry: Move user return notifier out of loop
x86/entry: Consolidate 32/64 bit syscall entry
x86/entry: Consolidate check_user_regs()
x86: Correct noinstr qualifiers
x86/idtentry: Remove stale comment -
Pull dma-mapping updates from Christoph Hellwig:
- make support for dma_ops optional
- move more code out of line
- add generic support for a dma_ops bypass mode
- misc cleanups
* tag 'dma-mapping-5.9' of git://git.infradead.org/users/hch/dma-mapping:
dma-contiguous: cleanup dma_alloc_contiguous
dma-debug: use named initializers for dir2name
powerpc: use the generic dma_ops_bypass mode
dma-mapping: add a dma_ops_bypass flag to struct device
dma-mapping: make support for dma ops optional
dma-mapping: inline the fast path dma-direct calls
dma-mapping: move the remaining DMA API calls out of line -
Pull fork cleanups from Christian Brauner:
"This is cleanup series from when we reworked a chunk of the process
creation paths in the kernel and switched to struct
{kernel_}clone_args.High-level this does two main things:
- Remove the double export of both do_fork() and _do_fork() where
do_fork() used the incosistent legacy clone calling convention.Now we only export _do_fork() which is based on struct
kernel_clone_args.- Remove the copy_thread_tls()/copy_thread() split making the
architecture specific HAVE_COYP_THREAD_TLS config option obsolete.This switches all remaining architectures to select
HAVE_COPY_THREAD_TLS and thus to the copy_thread_tls() calling
convention. The current split makes the process creation codepaths
more convoluted than they need to be. Each architecture has their own
copy_thread() function unless it selects HAVE_COPY_THREAD_TLS then it
has a copy_thread_tls() function.The split is not needed anymore nowadays, all architectures support
CLONE_SETTLS but quite a few of them never bothered to select
HAVE_COPY_THREAD_TLS and instead simply continued to use copy_thread()
and use the old calling convention. Removing this split cleans up the
process creation codepaths and paves the way for implementing clone3()
on such architectures since it requires the copy_thread_tls() calling
convention.After having made each architectures support copy_thread_tls() this
series simply renames that function back to copy_thread(). It also
switches all architectures that call do_fork() directly over to
_do_fork() and the struct kernel_clone_args calling convention. This
is a corollary of switching the architectures that did not yet support
it over to copy_thread_tls() since do_fork() is conditional on not
supporting copy_thread_tls() (Mostly because it lacks a separate
argument for tls which is trivial to fix but there's no need for this
function to exist.).The do_fork() removal is in itself already useful as it allows to to
remove the export of both do_fork() and _do_fork() we currently have
in favor of only _do_fork(). This has already been discussed back when
we added clone3(). The legacy clone() calling convention is - as is
probably well-known - somewhat odd:#
# ABI hall of shame
#
config CLONE_BACKWARDS
config CLONE_BACKWARDS2
config CLONE_BACKWARDS3that is aggravated by the fact that some architectures such as sparc
follow the CLONE_BACKWARDSx calling convention but don't really select
the corresponding config option since they call do_fork() directly.So do_fork() enforces a somewhat arbitrary calling convention in the
first place that doesn't really help the individual architectures that
deviate from it. They can thus simply be switched to _do_fork()
enforcing a single calling convention. (I really hope that any new
architectures will __not__ try to implement their own calling
conventions...)Most architectures already have made a similar switch (m68k comes to
mind).Overall this removes more code than it adds even with a good portion
of added comments. It simplifies a chunk of arch specific assembly
either by moving the code into C or by simply rewriting the assembly.Architectures that have been touched in non-trivial ways have all been
actually boot and stress tested: sparc and ia64 have been tested with
Debian 9 images. They are the two architectures which have been
touched the most. All non-trivial changes to architectures have seen
acks from the relevant maintainers. nios2 with a custom built
buildroot image. h8300 I couldn't get something bootable to test on
but the changes have been fairly automatic and I'm sure we'll hear
people yell if I broke something there.All other architectures that have been touched in trivial ways have
been compile tested for each single patch of the series via git rebase
-x "make ..." v5.8-rc2. arm{64} and x86{_64} have been boot tested
even though they have just been trivially touched (removal of the
HAVE_COPY_THREAD_TLS macro from their Kconfig) because well they are
basically "core architectures" and since it is trivial to get your
hands on a useable image"* tag 'fork-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
arch: rename copy_thread_tls() back to copy_thread()
arch: remove HAVE_COPY_THREAD_TLS
unicore: switch to copy_thread_tls()
sh: switch to copy_thread_tls()
nds32: switch to copy_thread_tls()
microblaze: switch to copy_thread_tls()
hexagon: switch to copy_thread_tls()
c6x: switch to copy_thread_tls()
alpha: switch to copy_thread_tls()
fork: remove do_fork()
h8300: select HAVE_COPY_THREAD_TLS, switch to kernel_clone_args
nios2: enable HAVE_COPY_THREAD_TLS, switch to kernel_clone_args
ia64: enable HAVE_COPY_THREAD_TLS, switch to kernel_clone_args
sparc: unconditionally enable HAVE_COPY_THREAD_TLS
sparc: share process creation helpers between sparc and sparc64
sparc64: enable HAVE_COPY_THREAD_TLS
fork: fold legacy_clone_args_valid() into _do_fork()
04 Aug, 2020
1 commit
-
Pull x86 microcode update from Ingo Molnar:
"Remove the microcode loader's FW_LOADER coupling"* tag 'x86-microcode-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/microcode: Do not select FW_LOADER
31 Jul, 2020
1 commit
-
- Add support for zstd compressed kernel
- Define __DISABLE_EXPORTS in Makefile
- Remove __DISABLE_EXPORTS definition from kaslr.c
- Bump the heap size for zstd.
- Update the documentation.
Integrates the ZSTD decompression code to the x86 pre-boot code.
Zstandard requires slightly more memory during the kernel decompression
on x86 (192 KB vs 64 KB), and the memory usage is independent of the
window size.__DISABLE_EXPORTS is now defined in the Makefile, which covers both
the existing use in kaslr.c, and the use needed by the zstd decompressor
in misc.c.This patch has been boot tested with both a zstd and gzip compressed
kernel on i386 and x86_64 using buildroot and QEMU.Additionally, this has been tested in production on x86_64 devices.
We saw a 2 second boot time reduction by switching kernel compression
from xz to zstd.Signed-off-by: Nick Terrell
Signed-off-by: Ingo Molnar
Tested-by: Sedat Dilek
Reviewed-by: Kees Cook
Link: https://lore.kernel.org/r/20200730190841.2071656-7-nickrterrell@gmail.com
24 Jul, 2020
1 commit
-
Replace the syscall entry work handling with the generic version. Provide
the necessary helper inlines to handle the real architecture specific
parts, e.g. ptrace.Use a temporary define for idtentry_enter_user which will be cleaned up
seperately.Signed-off-by: Thomas Gleixner
Reviewed-by: Kees Cook
Link: https://lkml.kernel.org/r/20200722220520.376213694@linutronix.de
19 Jul, 2020
1 commit
-
Avoid the overhead of the dma ops support for tiny builds that only
use the direct mapping.Signed-off-by: Christoph Hellwig
Tested-by: Alexey Kardashevskiy
Reviewed-by: Alexey Kardashevskiy
09 Jul, 2020
1 commit
05 Jul, 2020
1 commit
-
All architectures support copy_thread_tls() now, so remove the legacy
copy_thread() function and the HAVE_COPY_THREAD_TLS config option. Everyone
uses the same process creation calling convention based on
copy_thread_tls() and struct kernel_clone_args. This will make it easier to
maintain the core process creation code under kernel/, simplifies the
callpaths and makes the identical for all architectures.Cc: linux-arch@vger.kernel.org
Acked-by: Thomas Bogendoerfer
Acked-by: Greentime Hu
Acked-by: Geert Uytterhoeven
Reviewed-by: Kees Cook
Signed-off-by: Christian Brauner
18 Jun, 2020
1 commit
-
Since many compilers cannot disable KCOV with a function attribute,
help it to NOP out any __sanitizer_cov_*() calls injected in noinstr
code.This turns:
12: e8 00 00 00 00 callq 17
13: R_X86_64_PLT32 __sanitizer_cov_trace_pc-0x4into:
12: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
13: R_X86_64_NONE __sanitizer_cov_trace_pc-0x4Just like recordmcount does.
Signed-off-by: Peter Zijlstra (Intel)
Acked-by: Dmitry Vyukov
15 Jun, 2020
2 commits
-
KVM now supports using interrupt for 'page ready' APF event delivery and
legacy mechanism was deprecated. Switch KVM guests to the new one.Signed-off-by: Vitaly Kuznetsov
Message-Id:
[Use HYPERVISOR_CALLBACK_VECTOR instead of a separate vector. - Paolo]
Signed-off-by: Paolo Bonzini -
The x86 microcode support works just fine without FW_LOADER. In fact,
these days most people load microcode early during boot so FW_LOADER
never gets into the picture anyway.As almost everyone on x86 needs to enable MICROCODE, this by extension
means that FW_LOADER is always built into the kernel even if nothing
uses it. The FW_LOADER system is about two thousand lines long and
contains user-space facing interfaces that could potentially provide an
entry point into the kernel (or beyond).Remove the unnecessary select of FW_LOADER by MICROCODE. People who need
the FW_LOADER capability can still enable it.[ bp: Massage a bit. ]
Signed-off-by: Herbert Xu
Signed-off-by: Borislav Petkov
Link: https://lkml.kernel.org/r/20200610042911.GA20058@gondor.apana.org.au
14 Jun, 2020
3 commits
-
Pull more Kbuild updates from Masahiro Yamada:
- fix build rules in binderfs sample
- fix build errors when Kbuild recurses to the top Makefile
- covert '---help---' in Kconfig to 'help'
* tag 'kbuild-v5.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
treewide: replace '---help---' in Kconfig files with 'help'
kbuild: fix broken builds because of GZIP,BZIP2,LZOP variables
samples: binderfs: really compile this sample and fix build issues -
Pull x86 entry updates from Thomas Gleixner:
"The x86 entry, exception and interrupt code reworkThis all started about 6 month ago with the attempt to move the Posix
CPU timer heavy lifting out of the timer interrupt code and just have
lockless quick checks in that code path. Trivial 5 patches.This unearthed an inconsistency in the KVM handling of task work and
the review requested to move all of this into generic code so other
architectures can share.Valid request and solved with another 25 patches but those unearthed
inconsistencies vs. RCU and instrumentation.Digging into this made it obvious that there are quite some
inconsistencies vs. instrumentation in general. The int3 text poke
handling in particular was completely unprotected and with the batched
update of trace events even more likely to expose to endless int3
recursion.In parallel the RCU implications of instrumenting fragile entry code
came up in several discussions.The conclusion of the x86 maintainer team was to go all the way and
make the protection against any form of instrumentation of fragile and
dangerous code pathes enforcable and verifiable by tooling.A first batch of preparatory work hit mainline with commit
d5f744f9a2ac ("Pull x86 entry code updates from Thomas Gleixner")That (almost) full solution introduced a new code section
'.noinstr.text' into which all code which needs to be protected from
instrumentation of all sorts goes into. Any call into instrumentable
code out of this section has to be annotated. objtool has support to
validate this.Kprobes now excludes this section fully which also prevents BPF from
fiddling with it and all 'noinstr' annotated functions also keep
ftrace off. The section, kprobes and objtool changes are already
merged.The major changes coming with this are:
- Preparatory cleanups
- Annotating of relevant functions to move them into the
noinstr.text section or enforcing inlining by marking them
__always_inline so the compiler cannot misplace or instrument
them.- Splitting and simplifying the idtentry macro maze so that it is
now clearly separated into simple exception entries and the more
interesting ones which use interrupt stacks and have the paranoid
handling vs. CR3 and GS.- Move quite some of the low level ASM functionality into C code:
- enter_from and exit to user space handling. The ASM code now
calls into C after doing the really necessary ASM handling and
the return path goes back out without bells and whistels in
ASM.- exception entry/exit got the equivivalent treatment
- move all IRQ tracepoints from ASM to C so they can be placed as
appropriate which is especially important for the int3
recursion issue.- Consolidate the declaration and definition of entry points between
32 and 64 bit. They share a common header and macros now.- Remove the extra device interrupt entry maze and just use the
regular exception entry code.- All ASM entry points except NMI are now generated from the shared
header file and the corresponding macros in the 32 and 64 bit
entry ASM.- The C code entry points are consolidated as well with the help of
DEFINE_IDTENTRY*() macros. This allows to ensure at one central
point that all corresponding entry points share the same
semantics. The actual function body for most entry points is in an
instrumentable and sane state.There are special macros for the more sensitive entry points, e.g.
INT3 and of course the nasty paranoid #NMI, #MCE, #DB and #DF.
They allow to put the whole entry instrumentation and RCU handling
into safe places instead of the previous pray that it is correct
approach.- The INT3 text poke handling is now completely isolated and the
recursion issue banned. Aside of the entry rework this required
other isolation work, e.g. the ability to force inline bsearch.- Prevent #DB on fragile entry code, entry relevant memory and
disable it on NMI, #MC entry, which allowed to get rid of the
nested #DB IST stack shifting hackery.- A few other cleanups and enhancements which have been made
possible through this and already merged changes, e.g.
consolidating and further restricting the IDT code so the IDT
table becomes RO after init which removes yet another popular
attack vector- About 680 lines of ASM maze are gone.
There are a few open issues:
- An escape out of the noinstr section in the MCE handler which needs
some more thought but under the aspect that MCE is a complete
trainwreck by design and the propability to survive it is low, this
was not high on the priority list.- Paravirtualization
When PV is enabled then objtool complains about a bunch of indirect
calls out of the noinstr section. There are a few straight forward
ways to fix this, but the other issues vs. general correctness were
more pressing than parawitz.- KVM
KVM is inconsistent as well. Patches have been posted, but they
have not yet been commented on or picked up by the KVM folks.- IDLE
Pretty much the same problems can be found in the low level idle
code especially the parts where RCU stopped watching. This was
beyond the scope of the more obvious and exposable problems and is
on the todo list.The lesson learned from this brain melting exercise to morph the
evolved code base into something which can be validated and understood
is that once again the violation of the most important engineering
principle "correctness first" has caused quite a few people to spend
valuable time on problems which could have been avoided in the first
place. The "features first" tinkering mindset really has to stop.With that I want to say thanks to everyone involved in contributing to
this effort. Special thanks go to the following people (alphabetical
order): Alexandre Chartre, Andy Lutomirski, Borislav Petkov, Brian
Gerst, Frederic Weisbecker, Josh Poimboeuf, Juergen Gross, Lai
Jiangshan, Macro Elver, Paolo Bonzin,i Paul McKenney, Peter Zijlstra,
Vitaly Kuznetsov, and Will Deacon"* tag 'x86-entry-2020-06-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (142 commits)
x86/entry: Force rcu_irq_enter() when in idle task
x86/entry: Make NMI use IDTENTRY_RAW
x86/entry: Treat BUG/WARN as NMI-like entries
x86/entry: Unbreak __irqentry_text_start/end magic
x86/entry: __always_inline CR2 for noinstr
lockdep: __always_inline more for noinstr
x86/entry: Re-order #DB handler to avoid *SAN instrumentation
x86/entry: __always_inline arch_atomic_* for noinstr
x86/entry: __always_inline irqflags for noinstr
x86/entry: __always_inline debugreg for noinstr
x86/idt: Consolidate idt functionality
x86/idt: Cleanup trap_init()
x86/idt: Use proper constants for table size
x86/idt: Add comments about early #PF handling
x86/idt: Mark init only functions __init
x86/entry: Rename trace_hardirqs_off_prepare()
x86/entry: Clarify irq_{enter,exit}_rcu()
x86/entry: Remove DBn stacks
x86/entry: Remove debug IDT frobbing
x86/entry: Optimize local_db_save() for virt
... -
Since commit 84af7a6194e4 ("checkpatch: kconfig: prefer 'help' over
'---help---'"), the number of '---help---' has been gradually
decreasing, but there are still more than 2400 instances.This commit finishes the conversion. While I touched the lines,
I also fixed the indentation.There are a variety of indentation styles found.
a) 4 spaces + '---help---'
b) 7 spaces + '---help---'
c) 8 spaces + '---help---'
d) 1 space + 1 tab + '---help---'
e) 1 tab + '---help---' (correct indentation)
f) 1 tab + 1 space + '---help---'
g) 1 tab + 2 spaces + '---help---'In order to convert all of them to 1 tab + 'help', I ran the
following commend:$ find . -name 'Kconfig*' | xargs sed -i 's/^[[:space:]]*---help---/\thelp/'
Signed-off-by: Masahiro Yamada
13 Jun, 2020
1 commit
-
Pull more KVM updates from Paolo Bonzini:
"The guest side of the asynchronous page fault work has been delayed to
5.9 in order to sync with Thomas's interrupt entry rework, but here's
the rest of the KVM updates for this merge window.MIPS:
- Loongson portPPC:
- FixesARM:
- Fixesx86:
- KVM_SET_USER_MEMORY_REGION optimizations
- Fixes
- Selftest fixes"* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (62 commits)
KVM: x86: do not pass poisoned hva to __kvm_set_memory_region
KVM: selftests: fix sync_with_host() in smm_test
KVM: async_pf: Inject 'page ready' event only if 'page not present' was previously injected
KVM: async_pf: Cleanup kvm_setup_async_pf()
kvm: i8254: remove redundant assignment to pointer s
KVM: x86: respect singlestep when emulating instruction
KVM: selftests: Don't probe KVM_CAP_HYPERV_ENLIGHTENED_VMCS when nested VMX is unsupported
KVM: selftests: do not substitute SVM/VMX check with KVM_CAP_NESTED_STATE check
KVM: nVMX: Consult only the "basic" exit reason when routing nested exit
KVM: arm64: Move hyp_symbol_addr() to kvm_asm.h
KVM: arm64: Synchronize sysreg state on injecting an AArch32 exception
KVM: arm64: Make vcpu_cp1x() work on Big Endian hosts
KVM: arm64: Remove host_cpu_context member from vcpu structure
KVM: arm64: Stop sparse from moaning at __hyp_this_cpu_ptr
KVM: arm64: Handle PtrAuth traps early
KVM: x86: Unexport x86_fpu_cache and make it static
KVM: selftests: Ignore KVM 5-level paging support for VM_MODE_PXXV48_4K
KVM: arm64: Save the host's PtrAuth keys in non-preemptible context
KVM: arm64: Stop save/restoring ACTLR_EL1
KVM: arm64: Add emulation for 32bit guests accessing ACTLR2
...
12 Jun, 2020
1 commit
-
Merge the state of the locking kcsan branch before the read/write_once()
and the atomics modifications got merged.Squash the fallout of the rebase on top of the read/write once and atomic
fallback work into the merge. The history of the original branch is
preserved in tag locking-kcsan-2020-06-02.Signed-off-by: Thomas Gleixner
11 Jun, 2020
1 commit
-
Replace the extra interrupt handling code and reuse the existing idtentry
machinery. This moves the irq stack switching on 64-bit from ASM to C code;
32-bit already does the stack switching in C.This requires to remove HAVE_IRQ_EXIT_ON_IRQ_STACK as the stack switch is
not longer in the low level entry code.Signed-off-by: Thomas Gleixner
Signed-off-by: Ingo Molnar
Acked-by: Andy Lutomirski
Link: https://lore.kernel.org/r/20200521202119.078690991@linutronix.de
07 Jun, 2020
1 commit
-
Pull dma-mapping updates from Christoph Hellwig:
- enhance the dma pool to allow atomic allocation on x86 with AMD SEV
(David Rientjes)- two small cleanups (Jason Yan and Peter Collingbourne)
* tag 'dma-mapping-5.8' of git://git.infradead.org/users/hch/dma-mapping:
dma-contiguous: fix comment for dma_release_from_contiguous
dma-pool: scale the default DMA coherent pool size with memory capacity
x86/mm: unencrypted non-blocking DMA allocations use coherent pools
dma-pool: add pool sizes to debugfs
dma-direct: atomic allocations must come from atomic coherent pools
dma-pool: dynamically expanding atomic pools
dma-pool: add additional coherent pools to map to gfp mask
dma-remap: separate DMA atomic pools from direct remap code
dma-debug: make __dma_entry_alloc_check_leak() static
05 Jun, 2020
2 commits
-
This adds tests which will validate architecture page table helpers and
other accessors in their compliance with expected generic MM semantics.
This will help various architectures in validating changes to existing
page table helpers or addition of new ones.This test covers basic page table entry transformations including but not
limited to old, young, dirty, clean, write, write protect etc at various
level along with populating intermediate entries with next page table page
and validating them.Test page table pages are allocated from system memory with required size
and alignments. The mapped pfns at page table levels are derived from a
real pfn representing a valid kernel text symbol. This test gets called
via late_initcall().This test gets built and run when CONFIG_DEBUG_VM_PGTABLE is selected.
Any architecture, which is willing to subscribe this test will need to
select ARCH_HAS_DEBUG_VM_PGTABLE. For now this is limited to arc, arm64,
x86, s390 and powerpc platforms where the test is known to build and run
successfully Going forward, other architectures too can subscribe the test
after fixing any build or runtime problems with their page table helpers.Folks interested in making sure that a given platform's page table helpers
conform to expected generic MM semantics should enable the above config
which will just trigger this test during boot. Any non conformity here
will be reported as an warning which would need to be fixed. This test
will help catch any changes to the agreed upon semantics expected from
generic MM and enable platforms to accommodate it thereafter.[anshuman.khandual@arm.com: v17]
Link: http://lkml.kernel.org/r/1587436495-22033-3-git-send-email-anshuman.khandual@arm.com
[anshuman.khandual@arm.com: v18]
Link: http://lkml.kernel.org/r/1588564865-31160-3-git-send-email-anshuman.khandual@arm.com
Suggested-by: Catalin Marinas
Signed-off-by: Anshuman Khandual
Signed-off-by: Christophe Leroy
Signed-off-by: Qian Cai
Signed-off-by: Andrew Morton
Tested-by: Gerald Schaefer [s390]
Tested-by: Christophe Leroy [ppc32]
Reviewed-by: Ingo Molnar
Cc: Mike Rapoport
Cc: Vineet Gupta
Cc: Catalin Marinas
Cc: Will Deacon
Cc: Benjamin Herrenschmidt
Cc: Paul Mackerras
Cc: Michael Ellerman
Cc: Heiko Carstens
Cc: Vasily Gorbik
Cc: Christian Borntraeger
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: Borislav Petkov
Cc: "H. Peter Anvin"
Cc: Kirill A. Shutemov
Cc: Paul Walmsley
Cc: Palmer Dabbelt
Link: http://lkml.kernel.org/r/1583919272-24178-1-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Linus Torvalds -
Remove KVM_DEBUG_FS, which can easily be misconstrued as controlling
KVM-as-a-host. The sole user of CONFIG_KVM_DEBUG_FS was removed by
commit cfd8983f03c7b ("x86, locking/spinlocks: Remove ticket (spin)lock
implementation").Signed-off-by: Sean Christopherson
Message-Id:
Signed-off-by: Paolo Bonzini
04 Jun, 2020
3 commits
-
Merge more updates from Andrew Morton:
"More mm/ work, plenty more to comeSubsystems affected by this patch series: slub, memcg, gup, kasan,
pagealloc, hugetlb, vmscan, tools, mempolicy, memblock, hugetlbfs,
thp, mmap, kconfig"* akpm: (131 commits)
arm64: mm: use ARCH_HAS_DEBUG_WX instead of arch defined
x86: mm: use ARCH_HAS_DEBUG_WX instead of arch defined
riscv: support DEBUG_WX
mm: add DEBUG_WX support
drivers/base/memory.c: cache memory blocks in xarray to accelerate lookup
mm/thp: rename pmd_mknotpresent() as pmd_mkinvalid()
powerpc/mm: drop platform defined pmd_mknotpresent()
mm: thp: don't need to drain lru cache when splitting and mlocking THP
hugetlbfs: get unmapped area below TASK_UNMAPPED_BASE for hugetlbfs
sparc32: register memory occupied by kernel as memblock.memory
include/linux/memblock.h: fix minor typo and unclear comment
mm, mempolicy: fix up gup usage in lookup_node
tools/vm/page_owner_sort.c: filter out unneeded line
mm: swap: memcg: fix memcg stats for huge pages
mm: swap: fix vmstats for huge pages
mm: vmscan: limit the range of LRU type balancing
mm: vmscan: reclaim writepage is IO cost
mm: vmscan: determine anon/file pressure balance at the reclaim root
mm: balance LRU lists based on relative thrashing
mm: only count actual rotations as LRU reclaim cost
... -
Extract DEBUG_WX to mm/Kconfig.debug for shared use. Change to use
ARCH_HAS_DEBUG_WX instead of DEBUG_WX defined by arch port.Signed-off-by: Zong Li
Signed-off-by: Andrew Morton
Cc: Borislav Petkov
Cc: Catalin Marinas
Cc: "H. Peter Anvin"
Cc: Ingo Molnar
Cc: Palmer Dabbelt
Cc: Paul Walmsley
Cc: Thomas Gleixner
Cc: Will Deacon
Link: http://lkml.kernel.org/r/430736828d149df3f5b462d291e845ec690e0141.1587455584.git.zong.li@sifive.com
Signed-off-by: Linus Torvalds -
The memmap_init() function was made to iterate over memblock regions and
as the result the early_pfn_in_nid() function became obsolete. Since
CONFIG_NODES_SPAN_OTHER_NODES is only used to pick a stub or a real
implementation of early_pfn_in_nid(), it is also not needed anymore.Remove both early_pfn_in_nid() and the CONFIG_NODES_SPAN_OTHER_NODES.
Co-developed-by: Hoan Tran
Signed-off-by: Hoan Tran
Signed-off-by: Mike Rapoport
Signed-off-by: Andrew Morton
Tested-by: Hoan Tran [arm64]
Cc: Baoquan He
Cc: Brian Cain
Cc: Catalin Marinas
Cc: "David S. Miller"
Cc: Geert Uytterhoeven
Cc: Greentime Hu
Cc: Greg Ungerer
Cc: Guan Xuetao
Cc: Guo Ren
Cc: Heiko Carstens
Cc: Helge Deller
Cc: "James E.J. Bottomley"
Cc: Jonathan Corbet
Cc: Ley Foon Tan
Cc: Mark Salter
Cc: Matt Turner
Cc: Max Filippov
Cc: Michael Ellerman
Cc: Michal Hocko
Cc: Michal Simek
Cc: Nick Hu
Cc: Paul Walmsley
Cc: Richard Weinberger
Cc: Rich Felker
Cc: Russell King
Cc: Stafford Horne
Cc: Thomas Bogendoerfer
Cc: Tony Luck
Cc: Vineet Gupta
Cc: Yoshinori Sato
Link: http://lkml.kernel.org/r/20200412194859.12663-17-rppt@kernel.org
Signed-off-by: Linus Torvalds