03 Sep, 2019

1 commit


02 Sep, 2019

2 commits

  • Pull x86 fixes from Thomas Gleixner:
    "A set of fixes for x86:

    - Fix the bogus detection of 32bit user mode for uretprobes which
    caused corruption of the user return address resulting in
    application crashes. In the uprobes handler in_ia32_syscall() is
    obviously always returning false on a 64bit kernel. Use
    user_64bit_mode() instead which works correctly.

    - Prevent large page splitting when ftrace flips RW/RO on the kernel
    text which caused iTLB performance issues. Ftrace wants to be
    converted to text_poke() which avoids the problem, but for now
    allow large page preservation in the static protections check when
    the change request spans a full large page.

    - Prevent arch_dynirq_lower_bound() from returning 0 when the IOAPIC
    is configured via device tree. In the device tree case the GSI 1:1
    mapping is meaningless therefore the lower bound which protects the
    GSI range on ACPI machines is irrelevant. Return the lower bound
    which the core hands to the function instead of blindly returning 0
    which causes the core to allocate the invalid virtual interrupt
    number 0 which in turn prevents all drivers from allocating and
    requesting an interrupt.

    - Remove the bogus initialization of LDR and DFR in the 32bit bigsmp
    APIC driver. That uses physical destination mode where LDR/DFR are
    ignored, but the initialization and the missing clear of LDR caused
    the APIC to be left in an inconsistent state on kexec/reboot.

    - Clear LDR when clearing the APIC registers so the APIC is in a well
    defined state.

    - Initialize variables properly in the find_trampoline_placement()
    code.

    - Silence GCC9 build warning for the real mode part of the build"

    * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86/mm/cpa: Prevent large page split when ftrace flips RW on kernel text
    x86/build: Add -Wnoaddress-of-packed-member to REALMODE_CFLAGS, to silence GCC9 build warning
    x86/boot/compressed/64: Fix missing initialization in find_trampoline_placement()
    x86/apic: Include the LDR when clearing out APIC registers
    x86/apic: Do not initialize LDR and DFR for bigsmp
    uprobes/x86: Fix detection of 32-bit user mode
    x86/apic: Fix arch_dynirq_lower_bound() bug for DT enabled machines

    Linus Torvalds
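
The arch_dynirq_lower_bound() item in the pull message above can be pictured with a small userspace sketch. This is not the kernel function: dynirq_lower_bound_sketch() and its gsi_base parameter are hypothetical, and a gsi_base of 0 is assumed to model the device tree case where no GSI range was set up.

```c
/*
 * Hypothetical stand-in for the arch hook. 'gsi_base' is the first
 * interrupt number above the protected GSI range, or 0 when no such
 * range exists (assumed here to model the device tree case). 'from'
 * is the lower bound proposed by the core.
 */
static unsigned int dynirq_lower_bound_sketch(unsigned int gsi_base,
					      unsigned int from)
{
	/* Never hand back 0: fall back to the core's own lower bound. */
	return gsi_base ? gsi_base : from;
}
```

With a non-zero base the protected range wins; otherwise the value handed in by the core is passed through unchanged, so interrupt number 0 is never allocated.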
     
  • Pull perf fixes from Thomas Gleixner:
    "Two fixes for perf x86 hardware implementations:

    - Restrict the period on Nehalem machines to prevent perf from
    hogging the CPU

    - Prevent the AMD IBS driver from overwriting the hardware controlled
    and pre-seeded reserved bits (0-6) in the count register which
    caused a sample bias for dispatched micro-ops"

    * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    perf/x86/amd/ibs: Fix sample bias for dispatched micro-ops
    perf/x86/intel: Restrict period on Nehalem

    Linus Torvalds
     

01 Sep, 2019

2 commits

  • Pull tracing fixes from Steven Rostedt:
    "Small fixes and minor cleanups for tracing:

    - Make exported ftrace function not static

    - Fix NULL pointer dereference in reading probes as they are created

    - Fix NULL pointer dereference in k/uprobe clean up path

    - Various documentation fixes"

    * tag 'trace-v5.3-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
    tracing: Correct kdoc formats
    ftrace/x86: Remove mcount() declaration
    tracing/probe: Fix null pointer dereference
    tracing: Make exported ftrace_set_clr_event non-static
    ftrace: Check for successful allocation of hash
    ftrace: Check for empty hash and comment the race with registering probes
    ftrace: Fix NULL pointer dereference in t_probe_next()

    Linus Torvalds
     
  • Pull RISC-V fix from Paul Walmsley:
    "One significant fix for 32-bit RISC-V systems:

    Fix the RV32 memory map to prevent userspace from corrupting the
    FIXMAP area. Without this patch, the system can crash very early
    during the boot"

    * tag 'riscv/for-v5.3-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
    RISC-V: Fix FIXMAP area corruption on RV32 systems

    Linus Torvalds
     

31 Aug, 2019

4 commits

  • Pull KVM fixes from Radim Krčmář:
    "PPC:
    - Fix bug which could leave locks held in the host on return to a
    guest.

    x86:
    - Prevent infinitely looping emulation of a failing syscall while
    single stepping.

    - Do not crash the host when nesting is disabled"

    * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
    KVM: x86: Don't update RIP or do single-step on faulting emulation
    KVM: x86: hyper-v: don't crash on KVM_GET_SUPPORTED_HV_CPUID when kvm_intel.nested is disabled
    KVM: PPC: Book3S: Fix incorrect guest-to-user-translation error handling

    Linus Torvalds
     
  • Commit 562e14f72292 ("ftrace/x86: Remove mcount support") removed the
    support for using mcount, so we could remove the mcount() declaration
    to clean up.

    Link: http://lkml.kernel.org/r/20190826170150.10f101ba@xhacker.debian

    Signed-off-by: Jisheng Zhang
    Signed-off-by: Steven Rostedt (VMware)

    Jisheng Zhang
     
  • Pull ARM fixes from Russell King:
    "Three fixes for ARM this time around:

    - A fix for update_sections_early() to cope with NULL ->mm pointers.

    - A correction to the backtrace code to allow proper backtraces.

    - Reinforcement of pfn_valid() with PFNs >= 4GiB"

    * tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
    ARM: 8901/1: add a criteria for pfn_valid of arm
    ARM: 8897/1: check stmfd instruction using right shift
    ARM: 8874/1: mm: only adjust sections of valid mm structures

    Linus Torvalds
     
  • Pull ARM SoC fixes from Arnd Bergmann:
    "The majority of the fixes this time are for OMAP hardware, here is a
    breakdown of the significant changes:

    Various device tree bug fixes:
    - TI am57xx boards need a voltage level fix to avoid damaging SD
    cards
    - vf610-bk4 fails to detect its flash due to an incorrect description
    - meson-g12a USB phy configuration fails
    - meson-g12b reboot should not power off the SD card
    - Some corrections for apparently harmless differences from the
    documentation.

    Regression fixes:
    - ams-delta FIQ interrupts broke in 5.3
    - TI am3/am4 mmc controllers broke in 5.2

    The logic_pio driver (used on some Huawei ARM servers) got a few bug
    fixes for reliability.

    And a couple of compile-time warning fixes"

    * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (26 commits)
    soc: ixp4xx: Protect IXP4xx SoC drivers by ARCH_IXP4XX || COMPILE_TEST
    soc: ti: pm33xx: Make two symbols static
    soc: ti: pm33xx: Fix static checker warnings
    ARM: OMAP: dma: Mark expected switch fall-throughs
    ARM: dts: Fix incomplete dts data for am3 and am4 mmc
    bus: ti-sysc: Simplify cleanup upon failures in sysc_probe()
    ARM: OMAP1: ams-delta-fiq: Fix missing irq_ack
    ARM: dts: dra74x: Fix iodelay configuration for mmc3
    ARM: dts: am335x: Fix UARTs length
    ARM: OMAP2+: Fix omap4 errata warning on other SoCs
    bus: hisi_lpc: Add .remove method to avoid driver unbind crash
    bus: hisi_lpc: Unregister logical PIO range to avoid potential use-after-free
    lib: logic_pio: Add logic_pio_unregister_range()
    lib: logic_pio: Avoid possible overlap for unregistering regions
    lib: logic_pio: Fix RCU usage
    arm64: dts: amlogic: odroid-n2: keep SD card regulator always on
    arm64: dts: meson-g12a-sei510: enable IR controller
    arm64: dts: meson-g12a: add missing dwc2 phy-names
    ARM: dts: vf610-bk4: Fix qspi node description
    ARM: dts: Fix incorrect dcan register mapping for am3, am4 and dra7
    ...

    Linus Torvalds
     

30 Aug, 2019

8 commits

  • When counting dispatched micro-ops with cnt_ctl=1, in order to prevent
    sample bias, IBS hardware preloads the least significant 7 bits of
    current count (IbsOpCurCnt) with random values, such that, after the
    interrupt is handled and counting resumes, the next sample taken
    will be slightly perturbed.

    The current count bitfield is in the IBS execution control h/w register,
    alongside the maximum count field.

    Currently, the IBS driver writes that register with the maximum count,
    leaving zeroes to fill the current count field, thereby overwriting
    the random bits the hardware preloaded for itself.

    Fix the driver to actually retain and carry those random bits from the
    read of the IBS control register, through to its write, instead of
    overwriting the lower current count bits with zeroes.

    Tested with:

    perf record -c 100001 -e ibs_op/cnt_ctl=1/pp -a -C 0 taskset -c 0

    'perf annotate' output before:

    15.70 65: addsd %xmm0,%xmm1
    17.30 add $0x1,%rax
    15.88 cmp %rdx,%rax
    je 82
    17.32 72: test $0x1,%al
    jne 7c
    7.52 movapd %xmm1,%xmm0
    5.90 jmp 65
    8.23 7c: sqrtsd %xmm1,%xmm0
    12.15 jmp 65

    'perf annotate' output after:

    16.63 65: addsd %xmm0,%xmm1
    16.82 add $0x1,%rax
    16.81 cmp %rdx,%rax
    je 82
    16.69 72: test $0x1,%al
    jne 7c
    8.30 movapd %xmm1,%xmm0
    8.13 jmp 65
    8.24 7c: sqrtsd %xmm1,%xmm0
    8.39 jmp 65

    Tested on Family 15h and 17h machines.

    Machines prior to family 10h Rev. C don't have the RDWROPCNT capability,
    and have the IbsOpCurCnt bitfield reserved, so this patch shouldn't
    affect their operation.

    It is unknown why commit db98c5faf8cb ("perf/x86: Implement 64-bit
    counter support for IBS") ignored the lower 4 bits of the IbsOpCurCnt
    field; the number of preloaded random bits has always been 7, AFAICT.

    Signed-off-by: Kim Phillips
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: "Arnaldo Carvalho de Melo"
    Cc: Ingo Molnar
    Cc: Jiri Olsa
    Cc: Thomas Gleixner
    Cc: "Borislav Petkov"
    Cc: Stephane Eranian
    Cc: Alexander Shishkin
    Cc: "Namhyung Kim"
    Cc: "H. Peter Anvin"
    Link: https://lkml.kernel.org/r/20190826195730.30614-1-kim.phillips@amd.com

    Kim Phillips
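
The idea of carrying the preloaded random bits from the read through to the write can be sketched as a plain bit-mask operation. The bit position of the current count field (bit 32) and the names below are assumptions made for illustration; they are not taken from the patch.

```c
#include <stdint.h>

/*
 * Assumed layout for this sketch: the current count field starts at
 * bit 32 of the control value, and its 7 least significant bits are
 * the ones the hardware seeds with random values.
 */
#define CUR_CNT_SHIFT	32
#define CUR_CNT_RAND	(0x7FULL << CUR_CNT_SHIFT)

/*
 * Build the value to write back: the new maximum count, plus the
 * random bits carried over from the value that was just read. All
 * other current count bits are left as zero.
 */
static uint64_t ibs_ctl_keep_rand(uint64_t read_val, uint64_t max_cnt)
{
	return max_cnt | (read_val & CUR_CNT_RAND);
}
```

Writing max_cnt alone would correspond to the old behaviour described above, where the random bits are replaced by zeroes.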
     
  • We see our Nehalem machines reporting 'perfevents: irq loop stuck!' in
    some cases when using perf:

    perfevents: irq loop stuck!
    WARNING: CPU: 0 PID: 3485 at arch/x86/events/intel/core.c:2282 intel_pmu_handle_irq+0x37b/0x530
    ...
    RIP: 0010:intel_pmu_handle_irq+0x37b/0x530
    ...
    Call Trace:

    ? perf_event_nmi_handler+0x2e/0x50
    ? intel_pmu_save_and_restart+0x50/0x50
    perf_event_nmi_handler+0x2e/0x50
    nmi_handle+0x6e/0x120
    default_do_nmi+0x3e/0x100
    do_nmi+0x102/0x160
    end_repeat_nmi+0x16/0x50
    ...
    ? native_write_msr+0x6/0x20
    ? native_write_msr+0x6/0x20

    intel_pmu_enable_event+0x1ce/0x1f0
    x86_pmu_start+0x78/0xa0
    x86_pmu_enable+0x252/0x310
    __perf_event_task_sched_in+0x181/0x190
    ? __switch_to_asm+0x41/0x70
    ? __switch_to_asm+0x35/0x70
    ? __switch_to_asm+0x41/0x70
    ? __switch_to_asm+0x35/0x70
    finish_task_switch+0x158/0x260
    __schedule+0x2f6/0x840
    ? hrtimer_start_range_ns+0x153/0x210
    schedule+0x32/0x80
    schedule_hrtimeout_range_clock+0x8a/0x100
    ? hrtimer_init+0x120/0x120
    ep_poll+0x2f7/0x3a0
    ? wake_up_q+0x60/0x60
    do_epoll_wait+0xa9/0xc0
    __x64_sys_epoll_wait+0x1a/0x20
    do_syscall_64+0x4e/0x110
    entry_SYSCALL_64_after_hwframe+0x44/0xa9
    RIP: 0033:0x7fdeb1e96c03
    ...
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: acme@kernel.org
    Cc: Josh Hunt
    Cc: bpuranda@akamai.com
    Cc: mingo@redhat.com
    Cc: jolsa@redhat.com
    Cc: tglx@linutronix.de
    Cc: namhyung@kernel.org
    Cc: alexander.shishkin@linux.intel.com
    Link: https://lkml.kernel.org/r/1566256411-18820-1-git-send-email-johunt@akamai.com

    Josh Hunt
     
  • CONFIG_ARCH_QCOM is a dependency of the above and selects
    CONFIG_{PINCTRL, REGULATOR, TMPFS}.

    Bug: 133441279
    Bug: 133441092
    Bug: 133440650
    Change-Id: I22c37946ec3a62ccbd3fa65bbc09076964d86475
    Signed-off-by: Tri Vo

    Tri Vo
     
  • Combining the legacy Ion driver with SPARSEMEM for carveout regions
    results in invalid page structures, which breaks page_to_pfn(). This
    can be temporarily resolved with SPARSEMEM_VMEMMAP until the Ion
    driver is refactored and the problem can be reinvestigated.

    At that point the issue may be solved outright, or corrected using
    fewer resources than SPARSEMEM_VMEMMAP requires. The ABI does not
    change, so we have the flexibility to adjust this configuration.

    Signed-off-by: Mark Salyzyn
    Bug: 138851285
    Bug: 138149732
    Test: ABI_DEFINITION=common/abi_gki_aarch64.xml \
    BUILD_CONFIG=common/build.config.gki.aarch64 ./build/build_abi.sh
    Change-Id: I25cc8ebe9e25260b9869c5e8d8667b280f83ca51

    Mark Salyzyn
     
  • ftrace does not use text_poke() for enabling trace functionality. It uses
    its own mechanism and flips the whole kernel text to RW and back to RO.

    The CPA rework removed a loop based check of 4k pages which tried to
    preserve a large page by checking each 4k page whether the change would
    actually cover all pages in the large page.

    This resulted in endless loops for nothing as in testing it turned out that
    it actually never preserved anything. Of course testing failed to include
    ftrace, which is the one and only case which benefitted from the 4k loop.

    As a consequence enabling function tracing or ftrace based kprobes results
    in a full 4k split of the kernel text, which affects iTLB performance.

    The kernel RO protection is the only valid case where this can actually
    preserve large pages.

    All other static protections (RO data, data NX, PCI, BIOS) are truly
    static. So a conflict with those protections which results in a split
    should only ever happen when a change of memory next to a protected region
    is attempted. But these conflicts are rightfully splitting the large page
    to preserve the protected regions. In fact a change to the protected
    regions themselves is a bug and is warned about.

    Add an exception for the static protection check for kernel text RO when
    the region to be changed spans a full large page, which allows the large
    mappings to be preserved. This also prevents the syslog from being spammed
    about CPA violations when ftrace is used.

    The exception needs to be removed once ftrace has switched over to
    text_poke(), which avoids the whole issue.

    Fixes: 585948f4f695 ("x86/mm/cpa: Avoid the 4k pages check completely")
    Reported-by: Song Liu
    Signed-off-by: Thomas Gleixner
    Tested-by: Song Liu
    Reviewed-by: Song Liu
    Acked-by: Peter Zijlstra (Intel)
    Cc: stable@vger.kernel.org
    Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1908282355340.1938@nanos.tec.linutronix.de

    Thomas Gleixner
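
The condition "the region to be changed covers a full large page" can be sketched as a simple range check. This is a userspace illustration, not the CPA code: the function name, the 2M large page size and the parameter layout are assumptions.

```c
#include <stdbool.h>
#include <stdint.h>

#define SMALL_PAGE_SIZE	4096ULL			/* 4k base page */
#define LARGE_PAGE_SIZE	(2ULL * 1024 * 1024)	/* assumed 2M large page */

/*
 * True if [start, start + npages * 4k) covers the whole large page
 * that contains 'lpaddr'. Only in that case can the mapping stay
 * large instead of being split into 4k pages.
 */
static bool covers_full_large_page(uint64_t start, uint64_t npages,
				   uint64_t lpaddr)
{
	uint64_t lp_start = lpaddr & ~(LARGE_PAGE_SIZE - 1);
	uint64_t lp_end = lp_start + LARGE_PAGE_SIZE;
	uint64_t end = start + npages * SMALL_PAGE_SIZE;

	return start <= lp_start && end >= lp_end;
}
```

A request that flips the whole kernel text, as ftrace does, satisfies this for every large page inside the text, whereas a change that stops short of a large page boundary does not.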
     
  • …kernel/git/gustavoars/linux

    Pull fallthrough fixes from Gustavo A. R. Silva:
    "Fix fall-through warnings on arc and nds32 for multiple
    configurations"

    * tag 'Wimplicit-fallthrough-5.3-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux:
    nds32: Mark expected switch fall-throughs
    ARC: unwind: Mark expected switch fall-through

    Linus Torvalds
     
  • Mark switch cases where we are expecting to fall through.

    This patch fixes the following warnings (Building: allmodconfig nds32):

    include/math-emu/soft-fp.h:124:8: warning: this statement may fall through [-Wimplicit-fallthrough=]
    arch/nds32/kernel/signal.c:362:20: warning: this statement may fall through [-Wimplicit-fallthrough=]
    arch/nds32/kernel/signal.c:315:7: warning: this statement may fall through [-Wimplicit-fallthrough=]
    include/math-emu/op-common.h:417:11: warning: this statement may fall through [-Wimplicit-fallthrough=]
    include/math-emu/op-common.h:430:11: warning: this statement may fall through [-Wimplicit-fallthrough=]
    include/math-emu/op-common.h:310:11: warning: this statement may fall through [-Wimplicit-fallthrough=]
    include/math-emu/op-common.h:320:11: warning: this statement may fall through [-Wimplicit-fallthrough=]
    include/math-emu/op-common.h:310:11: warning: this statement may fall through [-Wimplicit-fallthrough=]
    include/math-emu/op-common.h:320:11: warning: this statement may fall through [-Wimplicit-fallthrough=]
    include/math-emu/soft-fp.h:124:8: warning: this statement may fall through [-Wimplicit-fallthrough=]
    include/math-emu/op-common.h:417:11: warning: this statement may fall through [-Wimplicit-fallthrough=]
    include/math-emu/op-common.h:430:11: warning: this statement may fall through [-Wimplicit-fallthrough=]
    include/math-emu/op-common.h:310:11: warning: this statement may fall through [-Wimplicit-fallthrough=]
    include/math-emu/op-common.h:320:11: warning: this statement may fall through [-Wimplicit-fallthrough=]
    include/math-emu/op-common.h:310:11: warning: this statement may fall through [-Wimplicit-fallthrough=]
    include/math-emu/op-common.h:320:11: warning: this statement may fall through [-Wimplicit-fallthrough=]

    Reported-by: Michael Ellerman
    Signed-off-by: Gustavo A. R. Silva

    Gustavo A. R. Silva
     
  • Mark switch cases where we are expecting to fall through.

    This patch fixes the following warnings (Building: haps_hs_defconfig arc):

    arch/arc/kernel/unwind.c: In function ‘read_pointer’:
    ./include/linux/compiler.h:328:5: warning: this statement may fall through [-Wimplicit-fallthrough=]
    do { \
    ^
    ./include/linux/compiler.h:338:2: note: in expansion of macro ‘__compiletime_assert’
    __compiletime_assert(condition, msg, prefix, suffix)
    ^~~~~~~~~~~~~~~~~~~~
    ./include/linux/compiler.h:350:2: note: in expansion of macro ‘_compiletime_assert’
    _compiletime_assert(condition, msg, __compiletime_assert_, __LINE__)
    ^~~~~~~~~~~~~~~~~~~
    ./include/linux/build_bug.h:39:37: note: in expansion of macro ‘compiletime_assert’
    #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg)
    ^~~~~~~~~~~~~~~~~~
    ./include/linux/build_bug.h:50:2: note: in expansion of macro ‘BUILD_BUG_ON_MSG’
    BUILD_BUG_ON_MSG(condition, "BUILD_BUG_ON failed: " #condition)
    ^~~~~~~~~~~~~~~~
    arch/arc/kernel/unwind.c:573:3: note: in expansion of macro ‘BUILD_BUG_ON’
    BUILD_BUG_ON(sizeof(u32) != sizeof(value));
    ^~~~~~~~~~~~
    arch/arc/kernel/unwind.c:575:2: note: here
    case DW_EH_PE_native:
    ^~~~

    Signed-off-by: Gustavo A. R. Silva

    Gustavo A. R. Silva
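
For readers unfamiliar with the warning, the following standalone sketch shows the kind of annotation these two patches add. The function is made up for illustration; the "fall through" comments are the marker form GCC recognizes at its default -Wimplicit-fallthrough level.

```c
/*
 * Add up the low 'width' bytes of a value. The cases deliberately
 * fall through; the comments mark that intent so the compiler does
 * not warn about it.
 */
static unsigned int sum_low_bytes(unsigned int value, int width)
{
	unsigned int sum = 0;

	switch (width) {
	case 3:
		sum += (value >> 16) & 0xff;
		/* fall through */
	case 2:
		sum += (value >> 8) & 0xff;
		/* fall through */
	case 1:
		sum += value & 0xff;
		break;
	default:
		break;
	}
	return sum;
}
```

Without the comments, each case that runs into the next one would produce a "this statement may fall through" warning like those listed above.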
     

29 Aug, 2019

3 commits

  • HiKey/HiKey960 need UEFI support to boot but don't need much of
    the other options that default on when enabling EFI.

    Bug: 140204135
    Signed-off-by: John Stultz
    Change-Id: I5c2e63701ae93277fcc3ddb36a39637237c65194

    John Stultz
     
  • pfn_valid() can be wrong when given an invalid pfn whose physical address
    exceeds BITS_PER_LONG, as the MSB will be trimmed when shifted.

    The issue originally arose from the call stack below, which corresponds to
    an access of /proc/kpageflags from userspace with an invalid pfn parameter
    and leads to a kernel panic.

    [46886.723249] c7 [] (stable_page_flags) from []
    [46886.723264] c7 [] (kpageflags_read) from []
    [46886.723280] c7 [] (proc_reg_read) from []
    [46886.723290] c7 [] (__vfs_read) from []
    [46886.723301] c7 [] (vfs_read) from []
    [46886.723315] c7 [] (SyS_pread64) from []
    (ret_fast_syscall+0x0/0x28)

    Signed-off-by: Zhaoyang Huang
    Signed-off-by: Russell King

    zhaoyang
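
The trimming described above can be shown with a round-trip check in userspace. This is a sketch under stated assumptions (4k pages, a 32-bit physical address type); the names are hypothetical and it does not reproduce the ARM implementation.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT_SKETCH 12	/* assumed 4k pages */

/* Assumed for illustration: a 32-bit physical address type. */
typedef uint32_t phys32_t;

/*
 * Convert pfn -> address -> pfn. If high bits were shifted out of
 * the 32-bit address, the round trip does not return the original
 * pfn, so the pfn cannot describe a real physical address.
 */
static bool pfn_fits_phys_addr(uint32_t pfn)
{
	phys32_t addr = (phys32_t)pfn << PAGE_SHIFT_SKETCH;

	return (addr >> PAGE_SHIFT_SKETCH) == pfn;
}
```

A pfn such as 0x100000 shifts to address 0 in this model, which would otherwise be mistaken for a valid low address.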
     
  • Currently, the virtual memory areas of Linux RISC-V, in increasing
    order of their virtual addresses, are organized as follows:
    1. User space area (This is lowest area and starts at 0x0)
    2. FIXMAP area
    3. VMALLOC area
    4. Kernel area (This is highest area and starts at PAGE_OFFSET)

    The maximum size of user space area is represented by TASK_SIZE.

    On RV32 systems, TASK_SIZE is defined as VMALLOC_START which causes the
    user space area to overlap the FIXMAP area. This allows user space apps
    to potentially corrupt the FIXMAP area and kernel OF APIs will crash
    whenever they access corrupted FDT in the FIXMAP area.

    On RV64 systems, TASK_SIZE is set to fixed 256GB and no other areas
    happen to overlap so we don't see any FIXMAP area corruptions.

    This patch fixes FIXMAP area corruption on RV32 systems by setting
    TASK_SIZE to FIXADDR_START. We also move FIXADDR_TOP, FIXADDR_SIZE,
    and FIXADDR_START defines to asm/pgtable.h so that we can avoid cyclic
    header includes.

    Signed-off-by: Anup Patel
    Tested-by: Alistair Francis
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Paul Walmsley

    Anup Patel
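
The overlap described in this entry reduces to comparing the top of user space with the start of the FIXMAP area. The sketch below uses made-up addresses that only respect the ordering given above (FIXMAP below VMALLOC); none of the values are taken from the RISC-V headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical RV32-like addresses, chosen only to respect the ordering. */
#define FIXADDR_START_SKETCH	0x9dc00000u
#define VMALLOC_START_SKETCH	0x9e000000u

/*
 * User space spans [0, task_size). It overlaps the FIXMAP area as
 * soon as it extends past the start of that area.
 */
static bool user_overlaps_fixmap(uint32_t task_size, uint32_t fixaddr_start)
{
	return task_size > fixaddr_start;
}
```

Using VMALLOC_START_SKETCH as the task size reports an overlap, while using FIXADDR_START_SKETCH does not, which mirrors the change of TASK_SIZE described above.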
     

28 Aug, 2019

5 commits

  • One of the very few warnings I have in the current build comes from
    arch/x86/boot/edd.c, where I get the following with a gcc9 build:

    arch/x86/boot/edd.c: In function ‘query_edd’:
    arch/x86/boot/edd.c:148:11: warning: taking address of packed member of ‘struct boot_params’ may result in an unaligned pointer value [-Waddress-of-packed-member]
    148 | mbrptr = boot_params.edd_mbr_sig_buffer;
    | ^~~~~~~~~~~

    This warning triggers because we throw away all the CFLAGS and then make
    a new set for REALMODE_CFLAGS, so the -Wno-address-of-packed-member we
    added in the following commit is not present:

    6f303d60534c ("gcc-9: silence 'address-of-packed-member' warning")

    The simplest solution for now is to adjust the warning for this version
    of CFLAGS as well, but it would definitely make sense to examine whether
    REALMODE_CFLAGS could be derived from CFLAGS, so that it picks up changes
    in the compiler flags environment automatically.

    Signed-off-by: Linus Torvalds
    Acked-by: Borislav Petkov
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Signed-off-by: Ingo Molnar

    Linus Torvalds
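
As background on what the warning is about, here is a small standalone sketch. The structure is hypothetical and unrelated to struct boot_params; it only shows a packed member that is not naturally aligned and one way to read it without forming a typed pointer to it.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical packed structure; the 32-bit field sits at offset 1. */
struct packed_sketch {
	uint8_t tag;
	uint32_t sig;
} __attribute__((packed));

/*
 * Read the possibly unaligned member by copying its bytes out,
 * rather than assigning its address to a uint32_t pointer, which
 * is the pattern the address-of-packed-member warning flags.
 */
static uint32_t read_sig(const struct packed_sketch *p)
{
	uint32_t v;

	memcpy(&v, &p->sig, sizeof(v));
	return v;
}
```

The patch above does not change any such code; it only passes the flag that disables the warning to the real mode build as well.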
     
  • Don't advance RIP or inject a single-step #DB if emulation signals a
    fault. This logic applies to all state updates that are conditional on
    clean retirement of the emulation instruction, e.g. updating RFLAGS was
    previously handled by commit 38827dbd3fb85 ("KVM: x86: Do not update
    EFLAGS on faulting emulation").

    Not advancing RIP is likely a nop, i.e. ctxt->eip isn't updated with
    ctxt->_eip until emulation "retires" anyways. Skipping #DB injection
    fixes a bug reported by Andy Lutomirski where a #UD on SYSCALL due to
    invalid state with EFLAGS.TF=1 would loop indefinitely due to emulation
    overwriting the #UD with #DB and thus restarting the bad SYSCALL over
    and over.

    Cc: Nadav Amit
    Cc: stable@vger.kernel.org
    Reported-by: Andy Lutomirski
    Fixes: 663f4c61b803 ("KVM: x86: handle singlestep during emulation")
    Signed-off-by: Sean Christopherson
    Signed-off-by: Radim Krčmář

    Sean Christopherson
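
The rule "do not commit state when emulation faults" can be modelled with a tiny state update function. Everything here is a hypothetical simplification for illustration (the structure, the field names, the function); it is not KVM's emulator interface.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, heavily simplified vCPU state. */
struct vcpu_sketch {
	uint64_t rip;		/* committed instruction pointer */
	bool db_pending;	/* a single-step #DB is queued */
};

/*
 * Commit the result of one emulated instruction and return the new
 * state. 'faulted' means the emulator signalled an exception instead
 * of retiring the instruction.
 */
static struct vcpu_sketch commit_emulation(struct vcpu_sketch v,
					   uint64_t next_rip,
					   bool single_step, bool faulted)
{
	if (faulted)
		return v;	/* keep RIP, do not queue #DB over the fault */

	v.rip = next_rip;
	if (single_step)
		v.db_pending = true;
	return v;
}
```

In this model a faulting instruction with single-stepping active leaves both fields untouched, so the original fault is what gets delivered.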
     
  • If kvm_intel is loaded with nested=0 parameter an attempt to perform
    KVM_GET_SUPPORTED_HV_CPUID results in OOPS as nested_get_evmcs_version hook
    in kvm_x86_ops is NULL (we assign it in nested_vmx_hardware_setup() and
    this only happens in case nested is enabled).

    Check that kvm_x86_ops->nested_get_evmcs_version is not NULL before
    calling it. With this, we can remove the stub from svm as it is no
    longer needed.

    Fixes: e2e871ab2f02 ("x86/kvm/hyper-v: Introduce nested_get_evmcs_version() helper")
    Signed-off-by: Vitaly Kuznetsov
    Reviewed-by: Jim Mattson
    Signed-off-by: Radim Krčmář

    Vitaly Kuznetsov
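
The pattern of checking an optional hook before calling it looks roughly like the sketch below. The ops structure, the helper and the choice of returning 0 when the hook is absent are assumptions for illustration, not the kvm_x86_ops definition.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ops table; the hook is only assigned in some setups. */
struct ops_sketch {
	uint16_t (*get_evmcs_version)(void);
};

static uint16_t evmcs_version_one(void)
{
	return 1;	/* arbitrary value for the sketch */
}

/* Call the optional hook only if it was assigned; report 0 otherwise. */
static uint16_t query_evmcs_version(const struct ops_sketch *ops)
{
	if (ops->get_evmcs_version)
		return ops->get_evmcs_version();
	return 0;
}
```

Because the caller tolerates a NULL hook, implementations that have nothing to report no longer need an empty stub.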
     
  • Pull ARC updates from Vineet Gupta:

    - support for Edge Triggered IRQs in ARC IDU intc

    - other fixes here and there

    * tag 'arc-5.3-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
    arc: prefer __section from compiler_attributes.h
    dt-bindings: IDU-intc: Add support for edge-triggered interrupts
    dt-bindings: IDU-intc: Clean up documentation
    ARCv2: IDU-intc: Add support for edge-triggered interrupts
    ARC: unwind: Mark expected switch fall-throughs
    ARC: [plat-hsdk]: allow to switch between AXI DMAC port configurations
    ARC: fix typo in setup_dma_ops log message
    ARCv2: entry: early return from exception need not clear U & DE bits

    Linus Torvalds
     
  • Pull networking fixes from David Miller:

    1) Use 32-bit index for tail calls in s390 bpf JIT, from Ilya
    Leoshkevich.

    2) Fix missed EPOLLOUT events in TCP, from Eric Dumazet. Same fix for
    SMC from Jason Baron.

    3) ipv6_mc_may_pull() should return 0 for malformed packets, not
    -EINVAL. From Stefano Brivio.

    4) Don't forget to unpin umem xdp pages in error path of
    xdp_umem_reg(). From Ivan Khoronzhuk.

    5) Fix sta object leak in mac80211, from Johannes Berg.

    6) Fix regression by not configuring PHYLINK on CPU port of bcm_sf2
    switches. From Florian Fainelli.

    7) Revert DMA sync removal from r8169 which was causing regressions on
    some MIPS Loongson platforms. From Heiner Kallweit.

    8) Use after free in flow dissector, from Jakub Sitnicki.

    9) Fix NULL derefs of net devices during ICMP processing across
    collect_md tunnels, from Hangbin Liu.

    10) proto_register() memory leaks, from Zhang Lin.

    11) Set NLM_F_MULTI flag in multipart netlink messages consistently,
    from John Fastabend.

    * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (66 commits)
    r8152: Set memory to all 0xFFs on failed reg reads
    openvswitch: Fix conntrack cache with timeout
    ipv4: mpls: fix mpls_xmit for iptunnel
    nexthop: Fix nexthop_num_path for blackhole nexthops
    net: rds: add service level support in rds-info
    net: route dump netlink NLM_F_MULTI flag missing
    s390/qeth: reject oversized SNMP requests
    sock: fix potential memory leak in proto_register()
    MAINTAINERS: Add phylink keyword to SFF/SFP/SFP+ MODULE SUPPORT
    xfrm/xfrm_policy: fix dst dev null pointer dereference in collect_md mode
    ipv4/icmp: fix rt dst dev null pointer dereference
    openvswitch: Fix log message in ovs conntrack
    bpf: allow narrow loads of some sk_reuseport_md fields with offset > 0
    bpf: fix use after free in prog symbol exposure
    bpf: fix precision tracking in presence of bpf2bpf calls
    flow_dissector: Fix potential use-after-free on BPF_PROG_DETACH
    Revert "r8169: remove not needed call to dma_sync_single_for_device"
    ipv6: propagate ipv6_add_dev's error returns out of ipv6_find_idev
    net/ncsi: Fix the payload copying for the request coming from Netlink
    qed: Add cleanup in qed_slowpath_start()
    ...

    Linus Torvalds
     

27 Aug, 2019

10 commits

  • KVM/PPC fix for 5.3

    - Fix bug which could leave locks locked in the host on return
    to a guest.

    Radim Krčmář
     
  • Gustavo noticed that 'new' can be left uninitialized if 'bios_start'
    happens to be less than or equal to 'entry->addr + entry->size'.

    Initialize the variable at the beginning of the iteration to the current
    value of 'bios_start'.

    Fixes: 0a46fff2f910 ("x86/boot/compressed/64: Fix boot on machines with broken E820 table")
    Reported-by: "Gustavo A. R. Silva"
    Signed-off-by: Kirill A. Shutemov
    Signed-off-by: Thomas Gleixner
    Link: https://lkml.kernel.org/r/20190826133326.7cxb4vbmiawffv2r@box

    Kirill A. Shutemov
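
The shape of the fix can be sketched with a simplified loop. This is not find_trampoline_placement(): the region type, the function and the 'need' parameter are hypothetical, and only the per-iteration initialization mirrors what the message describes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified memory map entry. */
struct region {
	uint64_t addr;
	uint64_t size;
};

/*
 * Walk the regions from the last to the first and return a start
 * address with 'need' bytes of room below it inside a region, or the
 * original bios_start if no region qualifies.
 */
static uint64_t place_below(const struct region *r, size_t n,
			    uint64_t bios_start, uint64_t need)
{
	for (size_t i = n; i-- > 0; ) {
		/* Initialized on every iteration, so it is never stale. */
		uint64_t new_start = bios_start;

		if (bios_start <= r[i].addr)
			continue;	/* region lies above bios_start */
		if (bios_start > r[i].addr + r[i].size)
			new_start = r[i].addr + r[i].size;
		if (new_start < need || new_start - need < r[i].addr)
			continue;	/* not enough room in this region */
		return new_start;
	}
	return bios_start;
}
```

When bios_start is less than or equal to the end of the region, the second condition is false and new_start keeps the value it was initialized with, which is the case that was previously left undefined.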
     
  • H_PUT_TCE_INDIRECT handlers receive a page with up to 512 TCEs from
    a guest. Although we verify correctness of TCEs before we do anything
    with the existing tables, there is a small window when a check in
    kvmppc_tce_validate might pass and right after that the guest alters
    the page of TCEs, causing an early exit from the handler and leaving
    srcu_read_lock(&vcpu->kvm->srcu) (virtual mode) or lock_rmap(rmap)
    (real mode) locked.

    This fixes the bug by jumping to the common exit code with an appropriate
    unlock.

    Cc: stable@vger.kernel.org # v4.11+
    Fixes: 121f80ba68f1 ("KVM: PPC: VFIO: Add in-kernel acceleration for VFIO")
    Signed-off-by: Alexey Kardashevskiy
    Signed-off-by: Paul Mackerras

    Alexey Kardashevskiy
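
The "jump to the common exit code with an appropriate unlock" pattern is sketched below in userspace. The lock is modelled as a counter and the validation is a placeholder; all names are hypothetical and nothing here is taken from the KVM PPC handlers.

```c
#include <stddef.h>

/* Hypothetical lock modelled as a counter so the balance can be checked. */
static int lock_depth;

static void sketch_lock(void)   { lock_depth++; }
static void sketch_unlock(void) { lock_depth--; }

/* Placeholder validation: an entry is valid if it is non-negative. */
static int entry_valid(long e) { return e >= 0; }

/*
 * Process a batch of entries under the lock. Every failure after the
 * lock is taken leaves through the common exit, so the lock is always
 * released.
 */
static int put_entries(const long *entries, size_t n)
{
	int ret = 0;

	sketch_lock();
	for (size_t i = 0; i < n; i++) {
		if (!entry_valid(entries[i])) {
			ret = -1;
			goto unlock_exit;	/* not a bare 'return' */
		}
	}
unlock_exit:
	sketch_unlock();
	return ret;
}
```

A bare return inside the loop would be the analogue of the bug: the function exits early and lock_depth stays at 1.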
     
  • This reverts commit 0d4e5ac7e78035950d564e65c38ce148cb9af681.

    Reason: Broke UML used by kernel_tests

    Bug: 139897923
    Change-Id: Ibf57c1f535e60caaef32dd14c4abbe253d8e185d
    Signed-off-by: Alistair Delva

    Alistair Delva
     
  • This reverts commit 1987b1b8f9f17a06255877e7917d0bb5b5377774.

    Reason: Broke UML used by kernel_tests

    Bug: 139897923
    Change-Id: If3541721fdca7cf6d77410309ae5b503b5a848d0
    Signed-off-by: Alistair Delva

    Alistair Delva
     
  • Consensus is that CONFIG_NR_CPUS of 32 will deal with the future
    products with a moderate engineering margin.

    Signed-off-by: Mark Salyzyn
    Test: confirm value propagates to .config
    Bug: 139693734
    Bug: 139406736
    Bug: 139692860
    Change-Id: I9687d37da254a612947398a45ae56ab01e676562

    Mark Salyzyn
     
  • Although APIC initialization will typically clear out the LDR before
    setting it, the APIC cleanup code should reset the LDR.

    This was discovered with a 32-bit KVM guest jumping into a kdump
    kernel. The stale bits in the LDR triggered a bug in the KVM APIC
    implementation which caused the destination mapping for VCPUs to be
    corrupted.

    Note that this isn't intended to paper over the KVM APIC bug. The kernel
    has to clear the LDR when resetting the APIC registers except when X2APIC
    is enabled.

    This lacks a Fixes tag because the failure to clear LDR goes way back into
    pre-git history.

    [ tglx: Made x2apic_enabled a function call as required ]

    Signed-off-by: Bandan Das
    Signed-off-by: Thomas Gleixner
    Cc: stable@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190826101513.5080-3-bsd@redhat.com

    Bandan Das
     
  • Legacy apic init uses bigsmp for smp systems with 8 or more CPUs. The
    bigsmp APIC implementation uses physical destination mode, but it
    nevertheless initializes LDR and DFR. The LDR even ends up incorrectly with
    multiple bits set.

    This does not cause a functional problem because LDR and DFR are ignored
    when physical destination mode is active, but it triggered a problem on a
    32-bit KVM guest which jumps into a kdump kernel.

    The multiple bits set unearthed a bug in the KVM APIC implementation. The
    code which creates the logical destination map for VCPUs ignores the
    disabled state of the APIC and ends up overwriting an existing valid entry
    and as a result, APIC calibration hangs in the guest during kdump
    initialization.

    Remove the bogus LDR/DFR initialization.

    This is not intended to work around the KVM APIC bug. The LDR/DFR
    initialization is wrong on its own.

    The issue goes back into the pre git history. The fixes tag is the commit
    in the bitkeeper import which introduced bigsmp support in 2003.

    git://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git

    Fixes: db7b9e9f26b8 ("[PATCH] Clustered APIC setup for >8 CPU systems")
    Suggested-by: Thomas Gleixner
    Signed-off-by: Bandan Das
    Signed-off-by: Thomas Gleixner
    Cc: stable@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190826101513.5080-2-bsd@redhat.com

    Bandan Das
     
  • Reported-by: Sedat Dilek
    Suggested-by: Josh Poimboeuf
    Signed-off-by: Nick Desaulniers
    Signed-off-by: Vineet Gupta

    Nick Desaulniers
     
  • This adds support for an optional extra interrupt cell to specify edge
    vs level triggered. It is backward compatible with dts files with only
    one cell, and will default to level-triggered in such a case.

    Note that I had to make a change to idu_irq_set_affinity as well, as
    this function was setting the interrupt type to "level" unconditionally,
    since this was the only type supported previously.
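
    The backward-compatible decoding described above could be modelled
    roughly as follows. This is a standalone userspace sketch, not the driver
    code: model_decode_cells(), struct model_irq_spec and the trigger
    encodings are hypothetical names and values chosen for illustration only.

    ```c
    #include <assert.h>
    #include <stddef.h>

    /* Hypothetical trigger encodings, used by this sketch only. */
    enum model_trigger { MODEL_TRIGGER_LEVEL = 0, MODEL_TRIGGER_EDGE = 1 };

    struct model_irq_spec {
        unsigned int hwirq;
        enum model_trigger trigger;
    };

    /*
     * Decode a one- or two-cell interrupt specifier.
     * Returns 0 on success, -1 if the cell count is unsupported.
     */
    static int model_decode_cells(const unsigned int *cells, size_t ncells,
                                  struct model_irq_spec *out)
    {
        assert(cells != NULL && out != NULL);

        if (ncells < 1 || ncells > 2)
            return -1;

        out->hwirq = cells[0];

        /* Old dts files carry a single cell: fall back to level-triggered. */
        if (ncells == 2 && cells[1] == MODEL_TRIGGER_EDGE)
            out->trigger = MODEL_TRIGGER_EDGE;
        else
            out->trigger = MODEL_TRIGGER_LEVEL;

        return 0;
    }
    ```

    With a single cell the sketch always yields a level-triggered interrupt,
    which is the compatibility behaviour the message describes.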

    Signed-off-by: Mischa Jonker
    Reviewed-by: Vineet Gupta
    Signed-off-by: Vineet Gupta

    Mischa Jonker
     

26 Aug, 2019

5 commits

  • Linux 5.3-rc6

    Signed-off-by: Greg Kroah-Hartman
    Change-Id: Id10580d48d56054408b3efe0bd1866d67aba2a3d

    Greg Kroah-Hartman
     
  • Linux 5.3-rc5

    Signed-off-by: Greg Kroah-Hartman
    Change-Id: Ibfaea1b9aca9f04a59def096f327c2afbd0cb296

    Greg Kroah-Hartman
     
  • 32-bit processes running on a 64-bit kernel are not always detected
    correctly, causing the process to crash when uretprobes are installed.

    The reason for the crash is that in_ia32_syscall() is used to determine the
    process's mode, which only works correctly when called from a syscall.

    In the case of uretprobes, however, the function is called from an
    exception and always returns 'false' on a 64-bit kernel. As a consequence,
    the process's return address gets corrupted.

    Fix this by using user_64bit_mode() instead of in_ia32_syscall(), which
    is correct in any situation.
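
    The difference between the two checks can be illustrated with a small
    userspace model. This is only a sketch under simplifying assumptions: the
    model_* names and both structs are hypothetical stand-ins, not the kernel
    definitions. The old check asks whether a compat syscall is in progress,
    the new one looks at the interrupted user context itself.

    ```c
    #include <assert.h>
    #include <stdbool.h>
    #include <stddef.h>

    /* Hypothetical stand-in for the saved user register state. */
    struct model_regs {
        bool cs_is_64bit;        /* interrupted user code ran in 64-bit mode */
    };

    /* Hypothetical stand-in for per-thread syscall state. */
    struct model_thread {
        bool in_compat_syscall;  /* true only while a 32-bit syscall is handled */
    };

    /* Models the old check: "is a compat syscall in progress?" */
    static bool model_in_ia32_syscall(const struct model_thread *t)
    {
        assert(t != NULL);
        return t->in_compat_syscall;
    }

    /* Models the new check: inspects the interrupted context. */
    static bool model_user_64bit_mode(const struct model_regs *regs)
    {
        assert(regs != NULL);
        return regs->cs_is_64bit;
    }

    /* Width in bytes of the return address slot, old way. */
    static size_t model_sizeof_long_old(const struct model_thread *t)
    {
        return model_in_ia32_syscall(t) ? 4 : 8;
    }

    /* Width in bytes of the return address slot, new way. */
    static size_t model_sizeof_long_new(const struct model_regs *regs)
    {
        return model_user_64bit_mode(regs) ? 8 : 4;
    }
    ```

    For a 32-bit process that hits the probe from an exception, no compat
    syscall is in progress, so the old model reports an 8 byte slot while the
    new one reports 4 bytes. Writing 8 bytes over a 4 byte return address is
    the kind of corruption described above.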

    [ tglx: Add a comment and the following historical info ]

    This should have been detected by the rename which happened in commit

    abfb9498ee13 ("x86/entry: Rename is_{ia32,x32}_task() to in_{ia32,x32}_syscall()")

    which states in the changelog:

    The is_ia32_task()/is_x32_task() function names are a big misnomer: they
    suggests that the compat-ness of a system call is a task property, which
    is not true, the compatness of a system call purely depends on how it
    was invoked through the system call layer.
    .....

    and then it went and blindly renamed every call site.

    Sadly enough this was already mentioned here:

    8faaed1b9f50 ("uprobes/x86: Introduce sizeof_long(), cleanup adjust_ret_addr() and
    arch_uretprobe_hijack_return_addr()")

    where the changelog says:

    TODO: is_ia32_task() is not what we actually want, TS_COMPAT does
    not necessarily mean 32bit. Fortunately syscall-like insns can't be
    probed so it actually works, but it would be better to rename and
    use is_ia32_frame().

    and goes all the way back to:

    0326f5a94dde ("uprobes/core: Handle breakpoint and singlestep exceptions")

    Oh well. 7+ years until someone actually tried a uretprobe on a 32bit
    process on a 64bit kernel....

    Fixes: 0326f5a94dde ("uprobes/core: Handle breakpoint and singlestep exceptions")
    Signed-off-by: Sebastian Mayr
    Signed-off-by: Thomas Gleixner
    Cc: Masami Hiramatsu
    Cc: Dmitry Safonov
    Cc: Oleg Nesterov
    Cc: Srikar Dronamraju
    Cc: stable@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190728152617.7308-1-me@sam.st

    Sebastian Mayr
     
  • Rahul Tanwar reported the following bug on DT systems:

    > 'ioapic_dynirq_base' contains the virtual IRQ base number. Presently, it is
    > updated to the end of hardware IRQ numbers but this is done only when IOAPIC
    > configuration type is IOAPIC_DOMAIN_LEGACY or IOAPIC_DOMAIN_STRICT. There is
    > a third type IOAPIC_DOMAIN_DYNAMIC which applies when IOAPIC configuration
    > comes from devicetree.
    >
    > See dtb_add_ioapic() in arch/x86/kernel/devicetree.c
    >
    > In case of IOAPIC_DOMAIN_DYNAMIC (DT/OF based system), 'ioapic_dynirq_base'
    > remains to zero initialized value. This means that for OF based systems,
    > virtual IRQ base will get set to zero.

    Such systems will very likely not even boot.

    For DT enabled machines ioapic_dynirq_base is irrelevant and not
    updated, so simply map the IRQ base 1:1 instead.
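
    A rough userspace model of that fallback, assuming a simplified view in
    which the lower bound depends only on ioapic_dynirq_base and the value
    handed in by the core. struct ioapic_model and the lower_bound_*()
    helpers are hypothetical names for this sketch and leave out everything
    else the real function has to consider.

    ```c
    #include <assert.h>
    #include <stddef.h>

    /* Hypothetical model state; the field name mirrors the commit text. */
    struct ioapic_model {
        unsigned int ioapic_dynirq_base;  /* stays 0 on DT systems */
    };

    /* Behaviour before the fix, simplified: the base is returned even if 0. */
    static unsigned int lower_bound_old(const struct ioapic_model *m,
                                        unsigned int from)
    {
        assert(m != NULL);
        (void)from;  /* the caller's suggestion is ignored */
        return m->ioapic_dynirq_base;
    }

    /* Behaviour after the fix, simplified: an unset base maps 1:1 to @from. */
    static unsigned int lower_bound_new(const struct ioapic_model *m,
                                        unsigned int from)
    {
        assert(m != NULL);
        return m->ioapic_dynirq_base ? m->ioapic_dynirq_base : from;
    }
    ```

    With ioapic_dynirq_base left at zero and the core asking for a lower
    bound of 1, the old model hands back 0 while the new one hands back 1.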

    Reported-by: Rahul Tanwar
    Tested-by: Rahul Tanwar
    Tested-by: Andy Shevchenko
    Signed-off-by: Thomas Gleixner
    Cc: Alexander Shishkin
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: alan@linux.intel.com
    Cc: bp@alien8.de
    Cc: cheol.yong.kim@intel.com
    Cc: qi-ming.wu@intel.com
    Cc: rahul.tanwar@intel.com
    Cc: rppt@linux.ibm.com
    Cc: tony.luck@intel.com
    Link: http://lkml.kernel.org/r/20190821081330.1187-1-rahul.tanwar@linux.intel.com
    Signed-off-by: Ingo Molnar

    Thomas Gleixner
     
  • Pull UML fix from Richard Weinberger:
    "Fix time travel mode"

    * tag 'for-linus-5.3-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
    um: fix time travel mode

    Linus Torvalds