01 Apr, 2020

11 commits

  • Every remaining user just has the error case returning -EFAULT.

    In fact, the exception was __get_user_asm_nozero(), which was removed in
    commit 4b842e4e25b1 ("x86: get rid of small constant size cases in
    raw_copy_{to,from}_user()"), and the other __get_user_xyz() macros just
    followed suit for consistency.

    Fix up some macro whitespace while at it.

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     
  • The last user was removed by commit 4b842e4e25b1 ("x86: get rid of small
    constant size cases in raw_copy_{to,from}_user()"). Get rid of the
    left-overs before somebody tries to use it again.

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     
  • Pull networking updates from David Miller:
    "Highlights:

    1) Fix the iwlwifi regression, from Johannes Berg.

    2) Support BSS coloring and 802.11 encapsulation offloading in
    hardware, from John Crispin.

    3) Fix some potential Spectre issues in qtnfmac, from Sergey
    Matyukevich.

    4) Add TTL decrement action to openvswitch, from Matteo Croce.

    5) Allow parallelization through flow_action setup by not taking the
    RTNL mutex, from Vlad Buslov.

    6) A lot of zero-length array to flexible-array conversions, from
    Gustavo A. R. Silva.

    7) Align XDP statistics names across several drivers for consistency,
    from Lorenzo Bianconi.

    8) Add various pieces of infrastructure for offloading conntrack, and
    make use of it in mlx5 driver, from Paul Blakey.

    9) Allow using listening sockets in BPF sockmap, from Jakub Sitnicki.

    10) Lots of parallelization improvements during configuration changes
    in mlxsw driver, from Ido Schimmel.

    11) Add support to devlink for generic packet traps, which report
    packets dropped during ACL processing. And use them in mlxsw
    driver. From Jiri Pirko.

    12) Support bcmgenet on ACPI, from Jeremy Linton.

    13) Make BPF compatible with RT, from Thomas Gleixner, Alexei
    Starovoitov, and yours truly.

    14) Support XDP meta-data in virtio_net, from Yuya Kusakabe.

    15) Fix sysfs permissions when network devices change namespaces, from
    Christian Brauner.

    16) Add a flags element to ethtool_ops so that drivers can more simply
    indicate which coalescing parameters they actually support, and
    therefore the generic layer can validate the user's ethtool
    request. Use this in all drivers, from Jakub Kicinski.

    17) Offload FIFO qdisc in mlxsw, from Petr Machata.

    18) Support UDP sockets in sockmap, from Lorenz Bauer.

    19) Fix stretch ACK bugs in several TCP congestion control modules,
    from Pengcheng Yang.

    20) Support virtual functions in octeontx2 driver, from Tomasz
    Duszynski.

    21) Add region operations for devlink and use it in ice driver to dump
    NVM contents, from Jacob Keller.

    22) Add support for hw offload of MACSEC, from Antoine Tenart.

    23) Add support for BPF programs that can be attached to LSM hooks,
    from KP Singh.

    24) Support for multiple paths, path managers, and counters in MPTCP.
    From Peter Krystad, Paolo Abeni, Florian Westphal, Davide Caratti,
    and others.

    25) More progress on adding the netlink interface to ethtool, from
    Michal Kubecek"

    * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2121 commits)
    net: ipv6: rpl_iptunnel: Fix potential memory leak in rpl_do_srh_inline
    cxgb4/chcr: nic-tls stats in ethtool
    net: dsa: fix oops while probing Marvell DSA switches
    net/bpfilter: remove superfluous testing message
    net: macb: Fix handling of fixed-link node
    net: dsa: ksz: Select KSZ protocol tag
    netdevsim: dev: Fix memory leak in nsim_dev_take_snapshot_write
    net: stmmac: add EHL 2.5Gbps PCI info and PCI ID
    net: stmmac: add EHL PSE0 & PSE1 1Gbps PCI info and PCI ID
    net: stmmac: create dwmac-intel.c to contain all Intel platform
    net: dsa: bcm_sf2: Support specifying VLAN tag egress rule
    net: dsa: bcm_sf2: Add support for matching VLAN TCI
    net: dsa: bcm_sf2: Move writing of CFP_DATA(5) into slicing functions
    net: dsa: bcm_sf2: Check earlier for FLOW_EXT and FLOW_MAC_EXT
    net: dsa: bcm_sf2: Disable learning for ASP port
    net: dsa: b53: Deny enslaving port 7 for 7278 into a bridge
    net: dsa: b53: Prevent tagged VLAN on port 7 for 7278
    net: dsa: b53: Restore VLAN entries upon (re)configuration
    net: dsa: bcm_sf2: Fix overflow checks
    hv_netvsc: Remove unnecessary round_up for recv_completion_cnt
    ...

    Linus Torvalds
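
Item 6 in the highlights above refers to replacing zero-length arrays (`data[0]`, a GNU extension) with C99 flexible array members (`data[]`). A minimal user-space sketch of the resulting pattern follows; `struct msg` and `msg_alloc()` are hypothetical names chosen for illustration and are not taken from the merged series.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical message type; the payload is a C99 flexible array member. */
struct msg {
	size_t len;
	unsigned char data[];	/* "data[0]" in the older zero-length style */
};

/* Allocate header and payload in one block and copy the payload in. */
struct msg *msg_alloc(const void *src, size_t len)
{
	struct msg *m;

	assert(src != NULL || len == 0);
	m = malloc(sizeof(*m) + len);
	if (!m)
		return NULL;
	m->len = len;
	if (len)
		memcpy(m->data, src, len);
	return m;
}
```

The payload is not counted in `sizeof(struct msg)`, so the allocation adds `len` explicitly. The caller releases the whole object with a single `free()`.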
     
  • Pull Kbuild updates from Masahiro Yamada:
    "Build system:

    - add CONFIG_UNUSED_KSYMS_WHITELIST, which will be useful to define a
    fixed set of export symbols for Generic Kernel Image (GKI)

    - allow running 'make dt_binding_check' without .config

    - use full schema for checking DT examples in *.yaml files

    - make modpost fail for missing MODULE_IMPORT_NS(), which makes more
    sense because we know the produced modules are never loadable

    - Remove unused 'AS' variable

    Kconfig:

    - sanitize DEFCONFIG_LIST, and remove ARCH_DEFCONFIG from Kconfig
    files

    - relax the 'imply' behavior so that symbols implied by 'y' can
    become 'm'

    - make 'imply' obey 'depends on' in order to make 'imply' really weak

    Misc:

    - add documentation on building the kernel with Clang/LLVM

    - revive __HAVE_ARCH_STRLEN for 32bit sparc to use optimized strlen()

    - fix warning from deb-pkg builds when CONFIG_DEBUG_INFO=n

    - various script and Makefile cleanups"

    * tag 'kbuild-v5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (34 commits)
    Makefile: Update kselftest help information
    kbuild: deb-pkg: fix warning when CONFIG_DEBUG_INFO is unset
    kbuild: add outputmakefile to no-dot-config-targets
    kbuild: remove AS variable
    net: wan: wanxl: refactor the firmware rebuild rule
    net: wan: wanxl: use $(M68KCC) instead of $(M68KAS) for rebuilding firmware
    net: wan: wanxl: use allow to pass CROSS_COMPILE_M68k for rebuilding firmware
    kbuild: add comment about grouped target
    kbuild: add -Wall to KBUILD_HOSTCXXFLAGS
    kconfig: remove unused variable in qconf.cc
    sparc: revive __HAVE_ARCH_STRLEN for 32bit sparc
    kbuild: refactor Makefile.dtbinst more
    kbuild: compute the dtbs_install destination more simply
    Makefile: disallow data races on gcc-10 as well
    kconfig: make 'imply' obey the direct dependency
    kconfig: allow symbols implied by y to become m
    net: drop_monitor: use IS_REACHABLE() to guard net_dm_hw_report()
    modpost: return error if module is missing ns imports and MODULE_ALLOW_MISSING_NAMESPACE_IMPORTS=n
    modpost: rework and consolidate logging interface
    kbuild: allow to run dt_binding_check without kernel configuration
    ...

    Linus Torvalds
     
  • Pull x86 vmware updates from Ingo Molnar:
    "The main change in this tree is the addition of 'steal time clock
    support' for VMware guests"

    * 'x86-vmware-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86/vmware: Use bool type for vmw_sched_clock
    x86/vmware: Enable steal time accounting
    x86/vmware: Add steal time clock support for VMware guests
    x86/vmware: Remove vmware_sched_clock_setup()
    x86/vmware: Make vmware_select_hypercall() __init

    Linus Torvalds
     
  • Pull x86 mm updates from Ingo Molnar:
    "A handful of changes:

    - two memory encryption related fixes

    - don't display the kernel's virtual memory layout plaintext on
    32-bit kernels either

    - two simplifications"

    * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86/mm: Remove the now redundant N_MEMORY check
    dma-mapping: Fix dma_pgprot() for unencrypted coherent pages
    x86: Don't let pgprot_modify() change the page encryption bit
    x86/mm/kmmio: Use this_cpu_ptr() instead get_cpu_var() for kmmio_ctx
    x86/mm/init/32: Stop printing the virtual memory layout

    Linus Torvalds
     
  • Pull misc x86 updates from Ingo Molnar:

    - extend the decoder maps with CET instructions

    - fix !vDSO corner cases

    * 'x86-misc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    perf/tests: Add CET instructions to the new instructions test
    x86/insn: Add Control-flow Enforcement (CET) instructions to the opcode map
    selftests/x86/ptrace_syscall_32: Fix no-vDSO segfault
    selftests/x86/vdso: Fix no-vDSO segfaults

    Linus Torvalds
     
  • Pull x86 fpu updates from Ingo Molnar:
    "Misc changes:

    - add a pkey sanity check

    - three commits to improve and future-proof xstate/xfeature handling
    some more"

    * 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86/pkeys: Add check for pkey "overflow"
    x86/fpu/xstate: Warn when checking alignment of disabled xfeatures
    x86/fpu/xstate: Fix XSAVES offsets in setup_xstate_comp()
    x86/fpu/xstate: Fix last_good_offset in setup_xstate_features()

    Linus Torvalds
     
  • Pull x86 cleanups from Ingo Molnar:
    "This topic tree contains more commits than usual:

    - most of them are uaccess cleanups/reorganization by Al

    - there's a bunch of prototype declaration (-Wmissing-prototypes)
    cleanups

    - misc other cleanups all around the map"

    * 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits)
    x86/mm/set_memory: Fix -Wmissing-prototypes warnings
    x86/efi: Add a prototype for efi_arch_mem_reserve()
    x86/mm: Mark setup_emu2phys_nid() static
    x86/jump_label: Move 'inline' keyword placement
    x86/platform/uv: Add a missing prototype for uv_bau_message_interrupt()
    kill uaccess_try()
    x86: unsafe_put-style macro for sigmask
    x86: x32_setup_rt_frame(): consolidate uaccess areas
    x86: __setup_rt_frame(): consolidate uaccess areas
    x86: __setup_frame(): consolidate uaccess areas
    x86: setup_sigcontext(): list user_access_{begin,end}() into callers
    x86: get rid of put_user_try in __setup_rt_frame() (both 32bit and 64bit)
    x86: ia32_setup_rt_frame(): consolidate uaccess areas
    x86: ia32_setup_frame(): consolidate uaccess areas
    x86: ia32_setup_sigcontext(): lift user_access_{begin,end}() into the callers
    x86/alternatives: Mark text_poke_loc_init() static
    x86/cpu: Fix a -Wmissing-prototypes warning for init_ia32_feat_ctl()
    x86/mm: Drop pud_mknotpresent()
    x86: Replace setup_irq() by request_irq()
    x86/configs: Slightly reduce defconfigs
    ...

    Linus Torvalds
     
  • Pull x86 build updates from Ingo Molnar:
    "A handful of updates: two linker script cleanups and a stock
    defconfig+allmodconfig bootability fix"

    * 'x86-build-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86/vdso: Discard .note.gnu.property sections in vDSO
    x86, vmlinux.lds: Add RUNTIME_DISCARD_EXIT to generic DISCARDS
    x86/Kconfig: Make CMDLINE_OVERRIDE depend on non-empty CMDLINE

    Linus Torvalds
     
  • Pull x86 boot updates from Ingo Molnar:
    "Misc cleanups and small enhancements all around the map"

    * 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86/boot/compressed: Fix debug_puthex() parameter type
    x86/setup: Fix static memory detection
    x86/vmlinux: Drop unneeded linker script discard of .eh_frame
    x86/*/Makefile: Use -fno-asynchronous-unwind-tables to suppress .eh_frame sections
    x86/boot/compressed: Remove .eh_frame section from bzImage
    x86/boot/compressed/64: Remove .bss/.pgtable from bzImage
    x86/boot/compressed/64: Use 32-bit (zero-extended) MOV for z_output_len
    x86/boot/compressed/64: Use LEA to initialize boot stack pointer

    Linus Torvalds
     

31 Mar, 2020

16 commits

  • Pull x86 timer updates from Thomas Gleixner:
    "A series of commits to make the MSR derived CPU and TSC frequency more
    accurate.

    It turned out that the frequency tables which have been taken from the
    SDM are inaccurate because the SDM provides truncated and rounded
    values, e.g. 83.3 MHz (83.3333...) or 116.7 MHz (116.6666...).

    This causes time drift in the range of ~1 second per hour (20-30
    seconds per day). On some of these SoCs it's not possible to
    recalibrate the TSC because there is no reference (PIT, HPET)
    available.

    With some reverse engineering it was established that the possible
    frequencies are derived from the base clock with fixed multiplier /
    divider pairs.

    For the CPU models which have a known crystal frequency the kernel now
    uses multiplier / divider pairs which bring the frequencies closer to
    reality and fix the observed time drift issues"

    * tag 'x86-timers-2020-03-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86/tsc_msr: Make MSR derived TSC frequency more accurate
    x86/tsc_msr: Fix MSR_FSB_FREQ mask for Cherry Trail devices
    x86/tsc_msr: Use named struct initializers

    Linus Torvalds
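
The drift figure quoted above can be checked with a little arithmetic. The sketch below assumes, purely for illustration, a 100 MHz base clock and the multiplier / divider pairs 5/6 and 7/6 as the true sources of the rounded 83.3 MHz and 116.7 MHz table values; the function names are hypothetical and this is not the kernel's calibration code.

```c
#include <assert.h>

/*
 * Clock drift, in seconds, accumulated over elapsed_s seconds of real time
 * when cycles are converted to time with assumed_hz while the counter
 * actually ticks at true_hz. A positive result means the clock runs fast.
 */
double tsc_drift_seconds(double assumed_hz, double true_hz, double elapsed_s)
{
	assert(assumed_hz > 0.0 && true_hz > 0.0);
	return (true_hz / assumed_hz - 1.0) * elapsed_s;
}

/* Frequency derived from a base clock and a multiplier / divider pair. */
double freq_from_ratio(double base_hz, unsigned int mul, unsigned int div)
{
	assert(div != 0);
	return base_hz * mul / div;
}
```

Under those assumptions, `tsc_drift_seconds(83.3e6, freq_from_ratio(100e6, 5, 6), 3600.0)` comes out at roughly 1.4 seconds per hour, which is the same order of magnitude as the drift described in the pull message.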
     
  • Pull x86 splitlock updates from Thomas Gleixner:
    "Support for 'split lock' detection:

    Atomic operations (lock prefixed instructions) which span two cache
    lines have to acquire the global bus lock. This is at least 1k cycles
    slower than an atomic operation within a cache line and disrupts
    performance on other cores. Aside from performance disruption this is
    an unprivileged form of DoS.

    Some newer CPUs have the capability to raise an #AC trap when such an
    operation is attempted. The detection is by default enabled in warning
    mode which will warn once when a user space application is caught. A
    command line option allows disabling the detection or selecting fatal
    mode, which terminates offending applications with SIGBUS"

    * tag 'x86-splitlock-2020-03-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86/split_lock: Avoid runtime reads of the TEST_CTRL MSR
    x86/split_lock: Rework the initialization flow of split lock detection
    x86/split_lock: Enable split lock detection by kernel

    Linus Torvalds
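
To make the term concrete: a locked access is "split" when its first and last bytes fall into different cache lines. The helper below is a user-space sketch of that check; the 64-byte line size and the name `spans_two_cache_lines()` are assumptions for illustration, and the code only classifies addresses without performing any atomic operation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 64u	/* assumed line size for this sketch */

/* Return true if the access [addr, addr + size) crosses a line boundary. */
bool spans_two_cache_lines(uintptr_t addr, size_t size)
{
	assert(size > 0 && size <= CACHE_LINE_SIZE);
	return (addr / CACHE_LINE_SIZE) !=
	       ((addr + size - 1) / CACHE_LINE_SIZE);
}
```

An 8-byte access at offset 60 of a line would be flagged, while the same access at offset 56 would not. Naturally aligned objects never cross a line boundary, so split locks typically come from packed or deliberately misaligned data.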
     
  • Pull x86 entry code updates from Thomas Gleixner:

    - Convert the 32bit syscalls to be pt_regs based which removes the
    requirement to push all 6 potential arguments onto the stack and
    consolidates the interface with the 64bit variant

    - The first small portion of the exception and syscall related entry
    code consolidation which aims to address the recently discovered
    issues vs. RCU, int3, NMI and some other exceptions which can
    interrupt any context. The bulk of the changes is still work in
    progress and aimed for 5.8.

    - A few lockdep namespace cleanups which have been applied into this
    branch to keep the prerequisites for the ongoing work confined.

    * tag 'x86-entry-2020-03-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (35 commits)
    x86/entry: Fix build error x86 with !CONFIG_POSIX_TIMERS
    lockdep: Rename trace_{hard,soft}{irq_context,irqs_enabled}()
    lockdep: Rename trace_softirqs_{on,off}()
    lockdep: Rename trace_hardirq_{enter,exit}()
    x86/entry: Rename ___preempt_schedule
    x86: Remove unneeded includes
    x86/entry: Drop asmlinkage from syscalls
    x86/entry/32: Enable pt_regs based syscalls
    x86/entry/32: Use IA32-specific wrappers for syscalls taking 64-bit arguments
    x86/entry/32: Rename 32-bit specific syscalls
    x86/entry/32: Clean up syscall_32.tbl
    x86/entry: Remove ABI prefixes from functions in syscall tables
    x86/entry/64: Add __SYSCALL_COMMON()
    x86/entry: Remove syscall qualifier support
    x86/entry/64: Remove ptregs qualifier from syscall table
    x86/entry: Move max syscall number calculation to syscallhdr.sh
    x86/entry/64: Split X32 syscall table into its own file
    x86/entry/64: Move sys_ni_syscall stub to common.c
    x86/entry/64: Use syscall wrappers for x32_rt_sigreturn
    x86/entry: Refactor SYS_NI macros
    ...

    Linus Torvalds
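
The idea behind pt_regs based syscalls can be sketched in user space: instead of receiving up to six arguments on the stack, the entry point receives one pointer to the saved registers and a thin wrapper picks out only the arguments the syscall actually takes. Everything below is hypothetical (`struct regs_sketch`, `do_add3()`, `wrapper_add3()`); only the register order (bx, cx, dx, si, di, bp) follows the 32-bit x86 syscall convention.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the argument registers saved at syscall entry. */
struct regs_sketch {
	long bx, cx, dx, si, di, bp;
};

/* Hypothetical syscall body taking three arguments. */
static long do_add3(long a, long b, long c)
{
	return a + b + c;
}

/* Wrapper: one pointer in, only the needed registers are read. */
long wrapper_add3(const struct regs_sketch *regs)
{
	assert(regs != NULL);
	return do_add3(regs->bx, regs->cx, regs->dx);
}
```

With this shape every entry in a syscall table can share one signature, which is what lets the 32-bit and 64-bit variants use a common interface.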
     
  • Pull timekeeping and timer updates from Thomas Gleixner:
    "Core:

    - Consolidation of the vDSO build infrastructure to address the
    difficulties of cross-builds for ARM64 compat vDSO libraries by
    restricting the exposure of header content to the vDSO build.

    This is achieved by splitting out header content into separate
    headers which contain only the minimal information necessary to
    build the vDSO. These new headers are included from the kernel
    headers and the vDSO specific files.

    - Enhancements to the generic vDSO library allowing more fine grained
    control over the compiled in code, further reducing architecture
    specific storage and preparing for adopting the generic library by
    PPC.

    - Cleanup and consolidation of the exit related code in posix CPU
    timers.

    - Small cleanups and enhancements here and there

    Drivers:

    - The obligatory new drivers: Ingenic JZ47xx and X1000 TCU support

    - Correct the clock rate of PIT64b global clock

    - setup_irq() cleanup

    - Preparation for PWM and suspend support for the TI DM timer

    - Expand the fttmr010 driver to support ast2600 systems

    - The usual small fixes, enhancements and cleanups all over the
    place"

    * tag 'timers-core-2020-03-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (80 commits)
    Revert "clocksource/drivers/timer-probe: Avoid creating dead devices"
    vdso: Fix clocksource.h macro detection
    um: Fix header inclusion
    arm64: vdso32: Enable Clang Compilation
    lib/vdso: Enable common headers
    arm: vdso: Enable arm to use common headers
    x86/vdso: Enable x86 to use common headers
    mips: vdso: Enable mips to use common headers
    arm64: vdso32: Include common headers in the vdso library
    arm64: vdso: Include common headers in the vdso library
    arm64: Introduce asm/vdso/processor.h
    arm64: vdso32: Code clean up
    linux/elfnote.h: Replace elf.h with UAPI equivalent
    scripts: Fix the inclusion order in modpost
    common: Introduce processor.h
    linux/ktime.h: Extract common header for vDSO
    linux/jiffies.h: Extract common header for vDSO
    linux/time64.h: Extract common header for vDSO
    linux/time32.h: Extract common header for vDSO
    linux/time.h: Extract common header for vDSO
    ...

    Linus Torvalds
     
  • Pull NOHZ update from Thomas Gleixner:
    "Remove TIF_NOHZ from three architectures

    These architectures use a static key to decide whether context
    tracking needs to be invoked and the TIF_NOHZ flag just causes a
    pointless slowpath execution for nothing"

    * tag 'timers-nohz-2020-03-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    arm64: Remove TIF_NOHZ
    arm: Remove TIF_NOHZ
    x86: Remove TIF_NOHZ
    context-tracking: Introduce CONFIG_HAVE_TIF_NOHZ
    x86/entry: Remove _TIF_NOHZ from _TIF_WORK_SYSCALL_ENTRY

    Linus Torvalds
     
  • Pull core SMP updates from Thomas Gleixner:
    "CPU (hotplug) updates:

    - Support for locked CSD objects in smp_call_function_single_async()
    which allows simplifying call sites in the scheduler core and MIPS

    - Treewide consolidation of CPU hotplug functions which ensures the
    consistency between the sysfs interface and kernel state. The low
    level functions cpu_up/down() are now confined to the core code and
    no longer accessible from random code"

    * tag 'smp-core-2020-03-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (22 commits)
    cpu/hotplug: Ignore pm_wakeup_pending() for disable_nonboot_cpus()
    cpu/hotplug: Hide cpu_up/down()
    cpu/hotplug: Move bringup of secondary CPUs out of smp_init()
    torture: Replace cpu_up/down() with add/remove_cpu()
    firmware: psci: Replace cpu_up/down() with add/remove_cpu()
    xen/cpuhotplug: Replace cpu_up/down() with device_online/offline()
    parisc: Replace cpu_up/down() with add/remove_cpu()
    sparc: Replace cpu_up/down() with add/remove_cpu()
    powerpc: Replace cpu_up/down() with add/remove_cpu()
    x86/smp: Replace cpu_up/down() with add/remove_cpu()
    arm64: hibernate: Use bringup_hibernate_cpu()
    cpu/hotplug: Provide bringup_hibernate_cpu()
    arm64: Use reboot_cpu instead of hardconding it to 0
    arm64: Don't use disable_nonboot_cpus()
    ARM: Use reboot_cpu instead of hardcoding it to 0
    ARM: Don't use disable_nonboot_cpus()
    ia64: Replace cpu_down() with smp_shutdown_nonboot_cpus()
    cpu/hotplug: Create a new function to shutdown nonboot cpus
    cpu/hotplug: Add new {add,remove}_cpu() functions
    sched/core: Remove rq.hrtick_csd_pending
    ...

    Linus Torvalds
     
  • Pull irq updates from Thomas Gleixner:
    "Updates for the interrupt subsystem:

    Treewide:

    - Cleanup of setup_irq() which is no longer required because the
    memory allocator is available early.

    Most cleanup changes come through the various maintainer trees, so
    the final removal of setup_irq() is postponed towards the end of
    the merge window.

    Core:

    - Protection against unsafe invocation of interrupt handlers and
    unsafe interrupt injection including a fixup of the offending
    PCI/AER error injection mechanism.

    Invoking interrupt handlers from arbitrary contexts, i.e. outside
    of an actual interrupt, can cause inconsistent state on the
    fragile x86 interrupt affinity changing hardware trainwreck.

    Drivers:

    - Second wave of support for the new ARM GICv4.1

    - Multi-instance support for Xilinx and PLIC interrupt controllers

    - CPU-Hotplug support for PLIC

    - The obligatory new driver for X1000 TCU

    - Enhancements, cleanups and fixes all over the place"

    * tag 'irq-core-2020-03-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (58 commits)
    unicore32: Replace setup_irq() by request_irq()
    sh: Replace setup_irq() by request_irq()
    hexagon: Replace setup_irq() by request_irq()
    c6x: Replace setup_irq() by request_irq()
    alpha: Replace setup_irq() by request_irq()
    irqchip/gic-v4.1: Eagerly vmap vPEs
    irqchip/gic-v4.1: Add VSGI property setup
    irqchip/gic-v4.1: Add VSGI allocation/teardown
    irqchip/gic-v4.1: Move doorbell management to the GICv4 abstraction layer
    irqchip/gic-v4.1: Plumb set_vcpu_affinity SGI callbacks
    irqchip/gic-v4.1: Plumb get/set_irqchip_state SGI callbacks
    irqchip/gic-v4.1: Plumb mask/unmask SGI callbacks
    irqchip/gic-v4.1: Add initial SGI configuration
    irqchip/gic-v4.1: Plumb skeletal VSGI irqchip
    irqchip/stm32: Retrigger both in eoi and unmask callbacks
    irqchip/gic-v3: Move irq_domain_update_bus_token to after checking for NULL domain
    irqchip/xilinx: Do not call irq_set_default_host()
    irqchip/xilinx: Enable generic irq multi handler
    irqchip/xilinx: Fill error code when irq domain registration fails
    irqchip/xilinx: Add support for multiple instances
    ...

    Linus Torvalds
     
  • Pull scheduler updates from Ingo Molnar:
    "The main changes in this cycle are:

    - Various NUMA scheduling updates: harmonize the load-balancer and
    NUMA placement logic to not work against each other. The intended
    result is better locality, better utilization and fewer migrations.

    - Introduce Thermal Pressure tracking and optimizations, to improve
    task placement on thermally overloaded systems.

    - Implement frequency invariant scheduler accounting on (some) x86
    CPUs. This is done by observing and sampling the 'recent' CPU
    frequency average at ~tick boundaries. The CPU provides this data
    via the APERF/MPERF MSRs. This hopefully makes our capacity
    estimates more precise and keeps tasks on the same CPU better even
    if it might seem overloaded at a lower momentary frequency. (As
    usual, turbo mode is a complication that we resolve by observing
    the maximum frequency and renormalizing to it.)

    - Add asymmetric CPU capacity wakeup scan to improve capacity
    utilization on asymmetric topologies. (big.LITTLE systems)

    - PSI fixes and optimizations.

    - RT scheduling capacity awareness fixes & improvements.

    - Optimize the CONFIG_RT_GROUP_SCHED constraints code.

    - Misc fixes, cleanups and optimizations - see the changelog for
    details"

    * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (62 commits)
    threads: Update PID limit comment according to futex UAPI change
    sched/fair: Fix condition of avg_load calculation
    sched/rt: cpupri_find: Trigger a full search as fallback
    kthread: Do not preempt current task if it is going to call schedule()
    sched/fair: Improve spreading of utilization
    sched: Avoid scale real weight down to zero
    psi: Move PF_MEMSTALL out of task->flags
    MAINTAINERS: Add maintenance information for psi
    psi: Optimize switching tasks inside shared cgroups
    psi: Fix cpu.pressure for cpu.max and competing cgroups
    sched/core: Distribute tasks within affinity masks
    sched/fair: Fix enqueue_task_fair warning
    thermal/cpu-cooling, sched/core: Move the arch_set_thermal_pressure() API to generic scheduler code
    sched/rt: Remove unnecessary push for unfit tasks
    sched/rt: Allow pulling unfitting task
    sched/rt: Optimize cpupri_find() on non-heterogenous systems
    sched/rt: Re-instate old behavior in select_task_rq_rt()
    sched/rt: cpupri_find: Implement fallback mechanism for !fit case
    sched/fair: Fix reordering of enqueue/dequeue_task_fair()
    sched/fair: Fix runnable_avg for throttled cfs
    ...

    Linus Torvalds
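
The frequency invariance item above can be illustrated numerically. MPERF advances at a fixed reference rate and APERF at the actual clock rate, so the ratio of their deltas over a sampling window gives the average frequency relative to base; dividing by the maximum (turbo) frequency renormalizes the result. The function below is a sketch under those assumptions: `freq_scale()`, its parameters, and the fixed-point scale of 1024 are chosen for illustration and this is not the kernel's implementation.

```c
#include <assert.h>
#include <stdint.h>

#define CAPACITY_SCALE 1024u	/* fixed-point "1.0", assumed for this sketch */

/*
 * Scale factor in [0, CAPACITY_SCALE] describing how fast the CPU ran
 * during the sampling window relative to its maximum frequency.
 * Deltas are assumed small enough that the products do not overflow.
 */
uint64_t freq_scale(uint64_t aperf_delta, uint64_t mperf_delta,
		    uint64_t base_khz, uint64_t max_khz)
{
	uint64_t scale;

	assert(mperf_delta != 0 && max_khz != 0);
	/* avg_khz = base_khz * aperf / mperf; scale = avg_khz / max_khz */
	scale = aperf_delta * base_khz * CAPACITY_SCALE /
		(mperf_delta * max_khz);
	return scale > CAPACITY_SCALE ? CAPACITY_SCALE : scale;
}
```

For a CPU with a 2 GHz base and a 4 GHz maximum that ran at base frequency for the whole window, the result is half of the scale. A task that looks busy at that moment is therefore accounted as using only half of the CPU's full capacity.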
     
  • Pull perf updates from Ingo Molnar:
    "The main changes in this cycle were:

    Kernel side changes:

    - A couple of x86/cpu cleanups and changes were grandfathered in due
    to patch dependencies. These clean up the set of CPU model/family
    matching macros with a consistent namespace and C99 initializer
    style.

    - A bunch of updates to various low level PMU drivers:
    * AMD Family 19h L3 uncore PMU
    * Intel Tiger Lake uncore support
    * misc fixes to LBR TOS sampling

    - optprobe fixes

    - perf/cgroup: optimize cgroup event sched-in processing

    - misc cleanups and fixes

    Tooling side changes are to:

    - perf {annotate,expr,record,report,stat,test}

    - perl scripting

    - libapi, libperf and libtraceevent

    - vendor events on Intel and S390, ARM cs-etm

    - Intel PT updates

    - Documentation changes and updates to core facilities

    - misc cleanups, fixes and other enhancements"

    * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (89 commits)
    cpufreq/intel_pstate: Fix wrong macro conversion
    x86/cpu: Cleanup the now unused CPU match macros
    hwrng: via_rng: Convert to new X86 CPU match macros
    crypto: Convert to new CPU match macros
    ASoC: Intel: Convert to new X86 CPU match macros
    powercap/intel_rapl: Convert to new X86 CPU match macros
    PCI: intel-mid: Convert to new X86 CPU match macros
    mmc: sdhci-acpi: Convert to new X86 CPU match macros
    intel_idle: Convert to new X86 CPU match macros
    extcon: axp288: Convert to new X86 CPU match macros
    thermal: Convert to new X86 CPU match macros
    hwmon: Convert to new X86 CPU match macros
    platform/x86: Convert to new CPU match macros
    EDAC: Convert to new X86 CPU match macros
    cpufreq: Convert to new X86 CPU match macros
    ACPI: Convert to new X86 CPU match macros
    x86/platform: Convert to new CPU match macros
    x86/kernel: Convert to new CPU match macros
    x86/kvm: Convert to new CPU match macros
    x86/perf/events: Convert to new CPU match macros
    ...

    Linus Torvalds
     
  • Pull locking updates from Ingo Molnar:
    "The main changes in this cycle were:

    - Continued user-access cleanups in the futex code.

    - percpu-rwsem rewrite that uses its own waitqueue and atomic_t
    instead of an embedded rwsem. This addresses a couple of
    weaknesses, but the primary motivation was complications on the -rt
    kernel.

    - Introduce raw lock nesting detection on lockdep
    (CONFIG_PROVE_RAW_LOCK_NESTING=y), document the raw_lock vs. normal
    lock differences. This too originates from -rt.

    - Reuse lockdep zapped chain_hlocks entries, to conserve RAM
    footprint on distro-ish kernels running into the "BUG:
    MAX_LOCKDEP_CHAIN_HLOCKS too low!" depletion of the lockdep
    chain-entries pool.

    - Misc cleanups, smaller fixes and enhancements - see the changelog
    for details"

    * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (55 commits)
    fs/buffer: Make BH_Uptodate_Lock bit_spin_lock a regular spinlock_t
    thermal/x86_pkg_temp: Make pkg_temp_lock a raw_spinlock_t
    Documentation/locking/locktypes: Minor copy editor fixes
    Documentation/locking/locktypes: Further clarifications and wordsmithing
    m68knommu: Remove mm.h include from uaccess_no.h
    x86: get rid of user_atomic_cmpxchg_inatomic()
    generic arch_futex_atomic_op_inuser() doesn't need access_ok()
    x86: don't reload after cmpxchg in unsafe_atomic_op2() loop
    x86: convert arch_futex_atomic_op_inuser() to user_access_begin/user_access_end()
    objtool: whitelist __sanitizer_cov_trace_switch()
    [parisc, s390, sparc64] no need for access_ok() in futex handling
    sh: no need of access_ok() in arch_futex_atomic_op_inuser()
    futex: arch_futex_atomic_op_inuser() calling conventions change
    completion: Use lockdep_assert_RT_in_threaded_ctx() in complete_all()
    lockdep: Add posixtimer context tracing bits
    lockdep: Annotate irq_work
    lockdep: Add hrtimer context tracing bits
    lockdep: Introduce wait-type checks
    completion: Use simple wait queues
    sched/swait: Prepare usage in completions
    ...

    Linus Torvalds
     
  • Pull EFI updates from Ingo Molnar:
    "The EFI changes in this cycle are much larger than usual, for two
    (positive) reasons:

    - The GRUB project is showing signs of life again, resulting in the
    introduction of the generic Linux/UEFI boot protocol, instead of
    x86 specific hacks which are increasingly difficult to maintain.
    There's hope that all future extensions will now go through that
    boot protocol.

    - Preparatory work for RISC-V EFI support.

    The main changes are:

    - Boot time GDT handling changes

    - Simplify handling of EFI properties table on arm64

    - Generic EFI stub cleanups, to improve command line handling, file
    I/O, memory allocation, etc.

    - Introduce a generic initrd loading method based on calling back
    into the firmware, instead of relying on the x86 EFI handover
    protocol or device tree.

    - Introduce a mixed mode boot method that does not rely on the x86
    EFI handover protocol either, and could potentially be adopted by
    other architectures (if another one ever surfaces where one
    execution mode is a superset of another)

    - Clean up the contents of 'struct efi', and move out everything that
    doesn't need to be stored there.

    - Incorporate support for UEFI spec v2.8A changes that permit
    firmware implementations to return EFI_UNSUPPORTED from UEFI
    runtime services at OS runtime, and expose a mask of which ones are
    supported or unsupported via a configuration table.

    - Partial fix for the lack of by-VA cache maintenance in the
    decompressor on 32-bit ARM.

    - Changes to load device firmware from EFI boot service memory
    regions

    - Various documentation updates and minor code cleanups and fixes"

    * 'efi-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (114 commits)
    efi/libstub/arm: Fix spurious message that an initrd was loaded
    efi/libstub/arm64: Avoid image_base value from efi_loaded_image
    partitions/efi: Fix partition name parsing in GUID partition entry
    efi/x86: Fix cast of image argument
    efi/libstub/x86: Use ULONG_MAX as upper bound for all allocations
    efi: Fix a mistype in comments mentioning efivar_entry_iter_begin()
    efi/libstub: Avoid linking libstub/lib-ksyms.o into vmlinux
    efi/x86: Preserve %ebx correctly in efi_set_virtual_address_map()
    efi/x86: Ignore the memory attributes table on i386
    efi/x86: Don't relocate the kernel unless necessary
    efi/x86: Remove extra headroom for setup block
    efi/x86: Add kernel preferred address to PE header
    efi/x86: Decompress at start of PE image load address
    x86/boot/compressed/32: Save the output address instead of recalculating it
    efi/libstub/x86: Deal with exit() boot service returning
    x86/boot: Use unsigned comparison for addresses
    efi/x86: Avoid using code32_start
    efi/x86: Make efi32_pe_entry() more readable
    efi/x86: Respect 32-bit ABI in efi32_pe_entry()
    efi/x86: Annotate the LOADED_IMAGE_PROTOCOL_GUID with SYM_DATA
    ...

    Linus Torvalds
     
  • Pull objtool updates from Ingo Molnar:
    "The biggest changes in this cycle were the vmlinux.o optimizations by
    Peter Zijlstra, which are preparatory and optimization work to run
    objtool against the much richer vmlinux.o object file, to perform
    new, whole-program section based logic. That work exposed a handful
    of problems with the existing code; the resulting fixes and
    optimizations are merged here. The complete 'vmlinux.o and noinstr'
    work is still work
    in progress, targeted for v5.8.

    There's also assorted fixes and enhancements from Josh Poimboeuf.

    In particular I'd like to draw attention to commit 644592d328370,
    which turns fatal objtool errors into failed kernel builds. This
    behavior is IMO now justified on multiple grounds (it's easy currently
    to not notice an essentially corrupted kernel build), and the commit
    has been in -next testing for several weeks, but there could still be
    build failures with old or weird toolchains. Should that be widespread
    or high profile enough then I'd suggest a quick revert, to not hold up
    the merge window"

    * 'core-objtool-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (22 commits)
    objtool: Re-arrange validate_functions()
    objtool: Optimize find_rela_by_dest_range()
    objtool: Delete cleanup()
    objtool: Optimize read_sections()
    objtool: Optimize find_symbol_by_name()
    objtool: Resize insn_hash
    objtool: Rename find_containing_func()
    objtool: Optimize find_symbol_*() and read_symbols()
    objtool: Optimize find_section_by_name()
    objtool: Optimize find_section_by_index()
    objtool: Add a statistics mode
    objtool: Optimize find_symbol_by_index()
    x86/kexec: Make relocate_kernel_64.S objtool clean
    x86/kexec: Use RIP relative addressing
    objtool: Rename func_for_each_insn_all()
    objtool: Rename func_for_each_insn()
    objtool: Introduce validate_return()
    objtool: Improve call destination function detection
    objtool: Fix clang switch table edge case
    objtool: Add relocation check for alternative sections
    ...

    Linus Torvalds
     
  • Pull ACPI updates from Rafael Wysocki:

    - Update the ACPICA code in the kernel to the 20200214 upstream
    release including:

    * Fix to re-enable the sleep button after wakeup (Anchal
    Agarwal).

    * Fixes for mistakes in comments and typos (Bob Moore).

    * ASL-ASL+ converter updates (Erik Kaneda).

    * Type casting cleanups (Sven Barth).

    - Clean up the initialization of the EC driver and eliminate some dead
    code from it (Rafael Wysocki).

    - Clean up the quirk tables in the AC and battery drivers (Hans de
    Goede).

    - Fix the global lock handling on x86 to ignore unspecified bit
    positions in the global lock field (Jan Engelhardt).

    - Add a new "tiny" driver for ACPI button devices exposed by VMs to
    guest kernels to send signals directly to init (Josh Triplett).

    - Add a kernel parameter to disable ACPI BGRT on x86 (Alex Hung).

    - Make the ACPI PCI host bridge and fan drivers use scnprintf() to
    avoid potential buffer overflows (Takashi Iwai).

    - Clean up assorted pieces of code:

    * Reorder "asmlinkage" to make g++ happy (Alexey Dobriyan).

    * Drop unneeded variable initialization (Colin Ian King).

    * Add missing __acquires/__releases annotations (Jules Irenge).

    * Replace list_for_each_safe() with list_for_each_entry_safe()
    (chenqiwu)"
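
    As a rough user-space illustration of why the scnprintf() conversions
    above matter: snprintf() returns the length the output would have had,
    so adding its return value to an offset can run past the end of the
    buffer, while scnprintf() returns what was actually stored. The
    sketch_scnprintf() and sketch_fill() helpers below are hypothetical
    stand-ins written for this note, not the kernel implementation or the
    code touched by these commits.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical user-space stand-in for the kernel's scnprintf(): returns
 * the number of characters actually stored in buf, not counting the
 * trailing NUL, so the result is always smaller than size (or 0).
 */
static int sketch_scnprintf(char *buf, size_t size, const char *fmt, ...)
{
    va_list args;
    int n;

    if (size == 0)
        return 0;

    va_start(args, fmt);
    n = vsnprintf(buf, size, fmt, args);
    va_end(args);

    if (n < 0)
        return 0;
    if ((size_t)n >= size)
        return (int)(size - 1);   /* output was truncated */
    return n;
}

/*
 * Appends two fields to buf while accumulating an offset, a pattern that
 * is common in sysfs show() routines. Because each call returns only what
 * was stored, len can never exceed size - 1.
 */
static size_t sketch_fill(char *buf, size_t size)
{
    size_t len = 0;

    len += sketch_scnprintf(buf + len, size - len, "%s", "state:");
    len += sketch_scnprintf(buf + len, size - len, "%s", "active-cooling");
    return len;
}
```

    With an 8-byte buffer, sketch_fill() stops at 7 characters, whereas the
    same accumulation with plain snprintf() would have advanced the offset
    to 20.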

    * tag 'acpi-5.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (31 commits)
    ACPICA: Update version to 20200214
    ACPI: PCI: Use scnprintf() for avoiding potential buffer overflow
    ACPI: fan: Use scnprintf() for avoiding potential buffer overflow
    ACPI: EC: Eliminate EC_FLAGS_QUERY_HANDSHAKE
    ACPI: EC: Do not clear boot_ec_is_ecdt in acpi_ec_add()
    ACPI: EC: Simplify acpi_ec_ecdt_start() and acpi_ec_init()
    ACPI: EC: Consolidate event handler installation code
    acpi/x86: ignore unspecified bit positions in the ACPI global lock field
    acpi/x86: add a kernel parameter to disable ACPI BGRT
    x86/acpi: make "asmlinkage" part first thing in the function definition
    ACPI: list_for_each_safe() -> list_for_each_entry_safe()
    ACPI: video: remove redundant assignments to variable result
    ACPI: OSL: Add missing __acquires/__releases annotations
    ACPI / battery: Cleanup Lenovo Ideapad Miix 320 DMI table entry
    ACPI / AC: Cleanup DMI quirk table
    ACPI: EC: Use fast path in acpi_ec_add() for DSDT boot EC
    ACPI: EC: Simplify acpi_ec_add()
    ACPI: EC: Drop AE_NOT_FOUND special case from ec_install_handlers()
    ACPI: EC: Avoid passing redundant argument to functions
    ACPI: EC: Avoid printing confusing messages in acpi_ec_setup()
    ...

    Linus Torvalds
     
  • Pull power management updates from Rafael Wysocki:
    "These clean up and rework the PM QoS API, address a suspend-to-idle
    wakeup regression on some ACPI-based platforms, clean up and extend a
    few cpuidle drivers, update multiple cpufreq drivers and cpufreq
    documentation, and fix a number of issues in devfreq and several other
    things all over.

    Specifics:

    - Clean up and rework the PM QoS API to simplify the code and reduce
    the size of it (Rafael Wysocki).

    - Fix a suspend-to-idle wakeup regression on Dell XPS13 9370 and
    similar platforms where the USB plug/unplug events are handled by
    the EC (Rafael Wysocki).

    - Clean up the intel_idle and PSCI cpuidle drivers (Rafael Wysocki,
    Ulf Hansson).

    - Extend the haltpoll cpuidle driver so that it can be forced to run
    on some systems where it refused to load (Maciej Szmigiero).

    - Convert several cpufreq documents to the .rst format and move the
    legacy driver documentation into one common file (Mauro Carvalho
    Chehab, Rafael Wysocki).

    - Update several cpufreq drivers:

    * Extend and fix the imx-cpufreq-dt driver (Anson Huang).

    * Improve the -EPROBE_DEFER handling and fix unwanted CPU
    overclocking on i.MX6ULL in imx6q-cpufreq (Anson Huang,
    Christoph Niedermaier).

    * Add support for Krait based SoCs to the qcom driver (Ansuel
    Smith).

    * Add support for OPP_PLUS to ti-cpufreq (Lokesh Vutla).

    * Add platform specific intermediate callbacks support to
    cpufreq-dt and update the imx6q driver (Peng Fan).

    * Simplify and consolidate some pieces of the intel_pstate
    driver and update its documentation (Rafael Wysocki, Alex
    Hung).

    - Fix several devfreq issues:

    * Remove unneeded extern keyword from a devfreq header file and
    use the DEVFREQ_GOV_UPDATE_INTERVAL event name instead of
    DEVFREQ_GOV_INTERVAL (Chanwoo Choi).

    * Fix the handling of dev_pm_qos_remove_request() result
    (Leonard Crestez).

    * Use constant name for userspace governor (Pierre Kuo).

    * Get rid of doc warnings and fix a typo (Christophe JAILLET).

    - Use built-in RCU list checking in some places in the PM core to
    avoid false-positive RCU usage warnings (Madhuparna Bhowmik).

    - Add explicit READ_ONCE()/WRITE_ONCE() annotations to low-level PM
    QoS routines (Qian Cai).

    - Fix removal of wakeup sources to avoid NULL pointer dereferences in
    a corner case (Neeraj Upadhyay).

    - Clean up the handling of hibernate compat ioctls and fix the
    related documentation (Eric Biggers).

    - Update the idle_inject power capping driver to use flexible-array
    members instead of zero-length arrays (Gustavo Silva).

    - Fix list format in a PM QoS document (Randy Dunlap).

    - Make the cpufreq stats module use scnprintf() to avoid potential
    buffer overflows (Takashi Iwai).

    - Add pm_runtime_get_if_active() to PM-runtime API (Sakari Ailus).

    - Allow no domain-idle-states DT property in generic PM domains (Ulf
    Hansson).

    - Fix a broken y-axis scale in the intel_pstate_tracer utility (Doug
    Smythies)"
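
    For readers unfamiliar with the zero-length array conversion mentioned
    for idle_inject above, the following is a minimal sketch of the C99
    flexible-array member that replaces the old GNU "[0]" idiom. The
    struct sketch_records type and sketch_alloc() helper are invented for
    illustration and do not correspond to the driver's actual data
    structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical record container; not taken from the idle_inject driver. */
struct sketch_records {
    size_t count;
    unsigned long values[];   /* C99 flexible-array member, was "values[0]" */
};

/*
 * Allocates the header plus room for count trailing elements in a single
 * zeroed allocation. Returns NULL on allocation failure.
 */
static struct sketch_records *sketch_alloc(size_t count)
{
    struct sketch_records *r;

    r = calloc(1, sizeof(*r) + count * sizeof(r->values[0]));
    if (r)
        r->count = count;
    return r;
}
```

    The flexible-array form must be the last member of the struct, which
    lets the compiler reject layouts that the zero-length idiom silently
    accepted.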

    * tag 'pm-5.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (78 commits)
    cpufreq: intel_pstate: Simplify intel_pstate_cpu_init()
    tools/power/x86/intel_pstate_tracer: fix a broken y-axis scale
    ACPI: PM: s2idle: Refine active GPEs check
    ACPICA: Allow acpi_any_gpe_status_set() to skip one GPE
    PM: sleep: wakeup: Skip wakeup_source_sysfs_remove() if device is not there
    PM / devfreq: Get rid of some doc warnings
    PM / devfreq: Fix handling dev_pm_qos_remove_request result
    PM / devfreq: Fix a typo in a comment
    PM / devfreq: Change to DEVFREQ_GOV_UPDATE_INTERVAL event name
    PM / devfreq: Remove unneeded extern keyword
    PM / devfreq: Use constant name of userspace governor
    ACPI: PM: s2idle: Fix comment in acpi_s2idle_prepare_late()
    cpufreq: qcom: Add support for krait based socs
    cpufreq: imx6q-cpufreq: Improve the logic of -EPROBE_DEFER handling
    cpufreq: Use scnprintf() for avoiding potential buffer overflow
    cpuidle: psci: Split psci_dt_cpu_init_idle()
    PM / Domains: Allow no domain-idle-states DT property in genpd when parsing
    PM / hibernate: Remove unnecessary compat ioctl overrides
    PM: hibernate: fix docs for ioctls that return loff_t via pointer
    Documentation: intel_pstate: update links for references
    ...

    Linus Torvalds
     
  • Pull driver core updates from Greg KH:
    "Here is the "big" set of driver core changes for 5.7-rc1.

    Nothing huge in here, just lots of little firmware core changes and
    use of new apis, a libfs fix, a debugfs api change, and some driver
    core deferred probe rework.

    All of these have been in linux-next for a while with no reported
    issues"

    * tag 'driver-core-5.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (44 commits)
    Revert "driver core: Set fw_devlink to "permissive" behavior by default"
    driver core: Set fw_devlink to "permissive" behavior by default
    driver core: Replace open-coded list_last_entry()
    driver core: Read atomic counter once in driver_probe_done()
    libfs: fix infoleak in simple_attr_read()
    driver core: Add device links from fwnode only for the primary device
    platform/x86: touchscreen_dmi: Add info for the Chuwi Vi8 Plus tablet
    platform/x86: touchscreen_dmi: Add EFI embedded firmware info support
    Input: icn8505 - Switch to firmware_request_platform for retreiving the fw
    Input: silead - Switch to firmware_request_platform for retreiving the fw
    selftests: firmware: Add firmware_request_platform tests
    test_firmware: add support for firmware_request_platform
    firmware: Add new platform fallback mechanism and firmware_request_platform()
    Revert "drivers: base: power: wakeup.c: Use built-in RCU list checking"
    drivers: base: power: wakeup.c: Use built-in RCU list checking
    component: allow missing unbind callback
    debugfs: remove return value of debugfs_create_file_size()
    debugfs: Check module state before warning in {full/open}_proxy_open()
    firmware: fix a double abort case with fw_load_sysfs_fallback
    arch_topology: Fix putting invalid cpu clk
    ...

    Linus Torvalds
     
  • Pull RAS updates from Borislav Petkov:

    - Do not report spurious MCEs on some Intel platforms caused by errata;
    by Prarit Bhargava.

    - Change dev-mcelog's hardcoded limit of 32 error records to a dynamic
    one, controlled by the number of logical CPUs, by Tony Luck.

    - Add support for the processor identification number (PPIN) on AMD, by
    Wei Huang.

    * tag 'ras_updates_for_5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86/mce/amd: Add PPIN support for AMD MCE
    x86/mce/dev-mcelog: Dynamically allocate space for machine check records
    x86/mce: Do not log spurious corrected mce errors

    Linus Torvalds
     

30 Mar, 2020

2 commits

  • * pm-qos: (30 commits)
    PM: QoS: annotate data races in pm_qos_*_value()
    Documentation: power: fix pm_qos_interface.rst format warning
    PM: QoS: Make CPU latency QoS depend on CONFIG_CPU_IDLE
    Documentation: PM: QoS: Update to reflect previous code changes
    PM: QoS: Update file information comments
    PM: QoS: Drop PM_QOS_CPU_DMA_LATENCY and rename related functions
    sound: Call cpu_latency_qos_*() instead of pm_qos_*()
    drivers: usb: Call cpu_latency_qos_*() instead of pm_qos_*()
    drivers: tty: Call cpu_latency_qos_*() instead of pm_qos_*()
    drivers: spi: Call cpu_latency_qos_*() instead of pm_qos_*()
    drivers: net: Call cpu_latency_qos_*() instead of pm_qos_*()
    drivers: mmc: Call cpu_latency_qos_*() instead of pm_qos_*()
    drivers: media: Call cpu_latency_qos_*() instead of pm_qos_*()
    drivers: hsi: Call cpu_latency_qos_*() instead of pm_qos_*()
    drm: i915: Call cpu_latency_qos_*() instead of pm_qos_*()
    x86: platform: iosf_mbi: Call cpu_latency_qos_*() instead of pm_qos_*()
    cpuidle: Call cpu_latency_qos_limit() instead of pm_qos_request()
    PM: QoS: Add CPU latency QoS API wrappers
    PM: QoS: Adjust pm_qos_request() signature and reorder pm_qos.h
    PM: QoS: Simplify definitions of CPU latency QoS trace events
    ...

    Rafael J. Wysocki
     
  • Minor comment conflict in mac80211.

    Signed-off-by: David S. Miller

    David S. Miller
     

28 Mar, 2020

7 commits


27 Mar, 2020

4 commits

  • With the command-line option -mx86-used-note=yes, which can also be
    enabled at binutils build time with:

    --enable-x86-used-note generate GNU x86 used ISA and feature properties

    the x86 assembler in binutils 2.32 and above generates a program property
    note in a note section, .note.gnu.property, to encode used x86 ISAs and
    features. But the kernel linker script contains only a single NOTE segment:

    PHDRS
    {
        text PT_LOAD FLAGS(5) FILEHDR PHDRS; /* PF_R|PF_X */
        dynamic PT_DYNAMIC FLAGS(4); /* PF_R */
        note PT_NOTE FLAGS(4); /* PF_R */
        eh_frame_hdr 0x6474e550;
    }

    The NOTE segment generated by the vDSO linker script is aligned to 4 bytes.
    But the .note.gnu.property section must be aligned to 8 bytes on x86-64:

    [hjl@gnu-skx-1 vdso]$ readelf -n vdso64.so

    Displaying notes found in: .note
    Owner Data size Description
    Linux 0x00000004 Unknown note type: (0x00000000)
    description data: 06 00 00 00
    readelf: Warning: note with invalid namesz and/or descsz found at offset 0x20
    readelf: Warning: type: 0x78, namesize: 0x00000100, descsize: 0x756e694c, alignment: 8

    Since the .note.gnu.property section in the vDSO is not checked by the
    dynamic linker, discard the .note.gnu.property sections in the vDSO.
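
    The effect of the alignment mismatch can be sketched with a little
    arithmetic. An ELF note record starts with three 32-bit words (namesz,
    descsz, type), followed by the name and the descriptor, each padded to
    the note alignment. The helpers below are hypothetical and assume a
    namesz of 6 ("Linux" plus its NUL) and the 4-byte descriptor shown in
    the readelf output.

```c
#include <assert.h>
#include <stddef.h>

/* namesz, descsz and type: three 32-bit words at the start of each note. */
#define SKETCH_NOTE_HDR_SIZE 12u

/* Rounds value up to the next multiple of align (a power of two). */
static size_t sketch_align_up(size_t value, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (value + align - 1) & ~(align - 1);
}

/*
 * Offset of the next note record, relative to the start of the current
 * one, when name and descriptor are each padded to align bytes.
 */
static size_t sketch_note_next(size_t namesz, size_t descsz, size_t align)
{
    size_t desc_off = sketch_align_up(SKETCH_NOTE_HDR_SIZE + namesz, align);

    return sketch_align_up(desc_off + descsz, align);
}
```

    Under these assumptions a parser using 4-byte alignment finds the next
    note at offset 0x18, while one using 8-byte alignment looks at 0x20,
    which is consistent with the offset reported in the readelf warning.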

    [ bp: Massage. ]

    Signed-off-by: H.J. Lu
    Signed-off-by: Borislav Petkov
    Reviewed-by: Kees Cook
    Link: https://lkml.kernel.org/r/20200326174314.254662-1-hjl.tools@gmail.com

    H.J. Lu
     
  • In the x86 kernel, .exit.text and .exit.data sections are discarded at
    runtime, not by the linker. Add RUNTIME_DISCARD_EXIT to generic DISCARDS
    and define it in the x86 kernel linker script to keep them.

    Because the sections are already placed before the DISCARD directive,
    this change has no effect on the generated kernel; it only documents the
    situation explicitly. Other architectures like ARM64 will use it too, so
    generalize the approach with the RUNTIME_DISCARD_EXIT define.

    [ bp: Massage and extend commit message. ]

    Signed-off-by: H.J. Lu
    Signed-off-by: Borislav Petkov
    Reviewed-by: Kees Cook
    Link: https://lkml.kernel.org/r/20200326193021.255002-1-hjl.tools@gmail.com

    H.J. Lu
     
  • In a context switch from a task that is detecting split locks to one that
    is not (or vice versa) we need to update the TEST_CTRL MSR. Currently this
    is done with the common sequence:

      read the MSR
      flip the bit
      write the MSR

    in order to avoid changing the value of any reserved bits in the MSR.

    Cache unused and reserved bits of TEST_CTRL MSR with SPLIT_LOCK_DETECT bit
    cleared during initialization, so we can avoid an expensive RDMSR
    instruction during context switch.
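
    A minimal user-space sketch of the caching idea, with the MSR modelled
    as a plain variable: the value read once at initialization (with the
    detection bit masked out) is reused on every switch, so only a write
    remains on the hot path. All sketch_* names are hypothetical, and the
    bit position is an assumption of this sketch rather than something
    stated in the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed position of SPLIT_LOCK_DETECT in TEST_CTRL for this sketch. */
#define SKETCH_SPLIT_LOCK_DETECT (1ULL << 29)

/* Simulated CPU: one MSR plus a counter of (expensive) reads. */
struct sketch_cpu {
    uint64_t test_ctrl;
    unsigned int rdmsr_count;
};

static uint64_t sketch_rdmsr(struct sketch_cpu *cpu)
{
    cpu->rdmsr_count++;
    return cpu->test_ctrl;
}

static void sketch_wrmsr(struct sketch_cpu *cpu, uint64_t value)
{
    cpu->test_ctrl = value;
}

/* Cached unused/reserved bits, with SPLIT_LOCK_DETECT cleared. */
static uint64_t sketch_test_ctrl_base;

/* Done once during initialization: the only read of the MSR. */
static void sketch_init(struct sketch_cpu *cpu)
{
    sketch_test_ctrl_base = sketch_rdmsr(cpu) & ~SKETCH_SPLIT_LOCK_DETECT;
}

/* Context-switch path: build the value from the cache, then write it. */
static void sketch_switch(struct sketch_cpu *cpu, int detect_on)
{
    uint64_t value = sketch_test_ctrl_base;

    if (detect_on)
        value |= SKETCH_SPLIT_LOCK_DETECT;
    sketch_wrmsr(cpu, value);
}
```

    The other bits survive every switch because they come from the cached
    base value rather than from a fresh read.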

    Suggested-by: Sean Christopherson
    Originally-by: Tony Luck
    Signed-off-by: Xiaoyao Li
    Signed-off-by: Thomas Gleixner
    Link: https://lkml.kernel.org/r/20200325030924.132881-3-xiaoyao.li@intel.com

    Xiaoyao Li
     
  • The current initialization flow of split lock detection has the following issues:

    1. It assumes the initial value of MSR_TEST_CTRL.SPLIT_LOCK_DETECT to be
    zero. However, it's possible that BIOS/firmware has set it.

    2. The X86_FEATURE_SPLIT_LOCK_DETECT flag is unconditionally set even if,
    due to a virtualization flaw, the FMS (family/model/stepping) indicates
    that the feature exists while it is actually not supported.

    Rework the initialization flow to solve the above issues. In detail,
    explicitly clear and set the split_lock_detect bit to verify that
    MSR_TEST_CTRL can be accessed, and rdmsr after wrmsr to ensure that the
    bit was cleared/set successfully.

    The X86_FEATURE_SPLIT_LOCK_DETECT flag is set only when the feature does
    exist and is not disabled with the kernel param "split_lock_detect=off".

    On each processor, explicitly update the SPLIT_LOCK_DETECT bit based on
    sld_state in split_lock_init(), since BIOS/firmware may touch it.
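
    The write-then-read-back check can be sketched in user space with a
    simulated MSR. The sketch_* names and the bit position are
    hypothetical, and the simulation simply ignores writes to bits it does
    not implement, whereas real hardware may fault instead. Leaving the
    bit cleared at the end is a choice of this sketch, not something the
    commit message specifies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed position of the split lock detect bit for this sketch. */
#define SKETCH_SLD_BIT (1ULL << 29)

/* Simulated MSR: writes only affect bits present in writable_mask. */
struct sketch_msr {
    uint64_t value;
    uint64_t writable_mask;
};

static void sketch_msr_write(struct sketch_msr *m, uint64_t v)
{
    m->value = (m->value & ~m->writable_mask) | (v & m->writable_mask);
}

static uint64_t sketch_msr_read(const struct sketch_msr *m)
{
    return m->value;
}

/* Sets or clears the bit, then reads back to confirm the write stuck. */
static bool sketch_modify_and_check(struct sketch_msr *m, bool on)
{
    uint64_t want = sketch_msr_read(m);

    if (on)
        want |= SKETCH_SLD_BIT;
    else
        want &= ~SKETCH_SLD_BIT;

    sketch_msr_write(m, want);
    return sketch_msr_read(m) == want;
}

/*
 * Reports support only if the bit can be both cleared and set, so neither
 * a firmware-preset bit nor a bogus FMS match is taken on trust.
 */
static bool sketch_sld_supported(struct sketch_msr *m)
{
    if (!sketch_modify_and_check(m, false))
        return false;
    if (!sketch_modify_and_check(m, true))
        return false;
    return sketch_modify_and_check(m, false);
}
```

    A register whose bit was preset by firmware still passes the check,
    while one that silently drops the write is reported as unsupported.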

    Originally-by: Thomas Gleixner
    Signed-off-by: Xiaoyao Li
    Signed-off-by: Thomas Gleixner
    Link: https://lkml.kernel.org/r/20200325030924.132881-2-xiaoyao.li@intel.com

    Xiaoyao Li