29 May, 2019

1 commit

  • Proc filesystem has special locking rules for various files. Thus
    fanotify which opens files on event delivery can easily deadlock
    against another process that waits for fanotify permission event to be
    handled. Since permission events on /proc have doubtful value anyway,
    just disallow them.

    Link: https://lore.kernel.org/linux-fsdevel/20190320131642.GE9485@quack2.suse.cz/
    Reviewed-by: Amir Goldstein
    Signed-off-by: Jan Kara

    Jan Kara
     

27 May, 2019

3 commits

  • Linus Torvalds
     
  • Pull tracing warning fix from Steven Rostedt:
    "Make the GCC 9 warning for sub struct memset go away.

    GCC 9 now warns about calling memset() on partial structures when it
    goes across multiple fields. This adds a helper for the place in
    tracing that does this type of clearing of a structure"

    * tag 'trace-v5.2-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
    tracing: Silence GCC 9 array bounds warning

    Linus Torvalds
     
  • Pull KVM fixes from Paolo Bonzini:
    "The usual smattering of fixes and tunings that came in too late for
    the merge window, but should not wait four months before they appear
    in a release.

    I also travelled a bit more than usual in the first part of May, which
    didn't help with picking up patches and reports promptly"

    * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (33 commits)
    KVM: x86: fix return value for reserved EFER
    tools/kvm_stat: fix fields filter for child events
    KVM: selftests: Wrap vcpu_nested_state_get/set functions with x86 guard
    kvm: selftests: aarch64: compile with warnings on
    kvm: selftests: aarch64: fix default vm mode
    kvm: selftests: aarch64: dirty_log_test: fix unaligned memslot size
    KVM: s390: fix memory slot handling for KVM_SET_USER_MEMORY_REGION
    KVM: x86/pmu: do not mask the value that is written to fixed PMUs
    KVM: x86/pmu: mask the result of rdpmc according to the width of the counters
    x86/kvm/pmu: Set AMD's virt PMU version to 1
    KVM: x86: do not spam dmesg with VMCS/VMCB dumps
    kvm: Check irqchip mode before assign irqfd
    kvm: svm/avic: fix off-by-one in checking host APIC ID
    KVM: selftests: do not blindly clobber registers in guest asm
    KVM: selftests: Remove duplicated TEST_ASSERT in hyperv_cpuid.c
    KVM: LAPIC: Expose per-vCPU timer_advance_ns to userspace
    KVM: LAPIC: Fix lapic_timer_advance_ns parameter overflow
    kvm: vmx: Fix -Wmissing-prototypes warnings
    KVM: nVMX: Fix using __this_cpu_read() in preemptible context
    kvm: fix compilation on s390
    ...

    Linus Torvalds
     

26 May, 2019

6 commits

  • Pull /dev/random fix from Ted Ts'o:
    "Fix a soft lockup regression when reading from /dev/random in early
    boot"

    * tag 'random_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random:
    random: fix soft lockup when trying to read from an uninitialized blocking pool

    Linus Torvalds
     
  • Fixes: eb9d1bf079bb: "random: only read from /dev/random after its pool has received 128 bits"
    Reported-by: kernel test robot
    Signed-off-by: Theodore Ts'o

    Theodore Ts'o
     
  • Starting with GCC 9, -Warray-bounds detects cases when memset is called
    starting on a member of a struct but the size to be cleared ends up
    writing over further members.

    Such a call happens in the trace code to clear, at once, all members
    after and including `seq` on struct trace_iterator:

    In function 'memset',
    inlined from 'ftrace_dump' at kernel/trace/trace.c:8914:3:
    ./include/linux/string.h:344:9: warning: '__builtin_memset' offset
    [8505, 8560] from the object at 'iter' is out of the bounds of
    referenced subobject 'seq' with type 'struct trace_seq' at offset
    4368 [-Warray-bounds]
    344 | return __builtin_memset(p, c, size);
    | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~

    In order to avoid GCC complaining about it, we compute the address
    ourselves by adding the offsetof distance instead of referring
    directly to the member.

    Since there are two places doing this clear (trace.c and trace_kdb.c),
    take the chance to move the workaround into a single place in
    the internal header.

    Link: http://lkml.kernel.org/r/20190523124535.GA12931@gmail.com

    Signed-off-by: Miguel Ojeda
    [ Removed unnecessary parenthesis around "iter" ]
    Signed-off-by: Steven Rostedt (VMware)

    Miguel Ojeda
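    The offsetof approach described in the message above can be sketched in
    plain C as follows. This is a minimal illustration, not the kernel's
    helper: `struct iter_example` and `iter_reset_example()` are hypothetical
    names standing in for struct trace_iterator and its reset routine.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct trace_iterator: members before `seq`
 * must survive a reset, `seq` and everything after it must be cleared. */
struct iter_example {
	int keep_me;
	char seq[16];
	long pos;
	int idx;
};

/* Clear `seq` and all later members. The start address is computed from
 * the base pointer plus offsetof(), rather than from &it->seq, so the
 * write is not expressed as an access to the `seq` subobject alone. */
static void iter_reset_example(struct iter_example *it)
{
	const size_t offset = offsetof(struct iter_example, seq);

	memset((char *)it + offset, 0, sizeof(*it) - offset);
}
```

    Keeping the computation in one helper means every caller that needs this
    partial clear shares the same workaround.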
     
  • Pull ext4 fixes from Ted Ts'o:
    "Bug fixes (including a regression fix) for ext4"

    * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
    ext4: fix dcache lookup of !casefolded directories
    ext4: do not delete unlinked inode from orphan list on failed truncate
    ext4: wait for outstanding dio during truncate in nojournal mode
    ext4: don't perform block validity checks on the journal inode

    Linus Torvalds
     
  • Pull libnvdimm fixes from Dan Williams:

    - Fix a regression that disabled device-mapper dax support

    - Remove unnecessary hardened-user-copy overhead (>30%) for dax
    read(2)/write(2).

    - Fix some compilation warnings.

    * tag 'libnvdimm-fixes-5.2-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
    libnvdimm/pmem: Bypass CONFIG_HARDENED_USERCOPY overhead
    dax: Arrange for dax_supported check to span multiple devices
    libnvdimm: Fix compilation warnings with W=1

    Linus Torvalds
     
  • Pull tracing fixes from Steven Rostedt:
    "Tom Zanussi sent me some small fixes and cleanups to the histogram
    code and I forgot to incorporate them.

    I also added a small clean up patch that was sent to me a while ago
    and I just noticed it"

    * tag 'trace-v5.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
    kernel/trace/trace.h: Remove duplicate header of trace_seq.h
    tracing: Add a check_val() check before updating cond_snapshot() track_val
    tracing: Check keys for variable references in expressions too
    tracing: Prevent hist_field_var_ref() from accessing NULL tracing_map_elts

    Linus Torvalds
     

25 May, 2019

30 commits

  • Found by visual inspection. This wasn't caught by my xfstest, since its
    only effect is that positive dentries in the cache are ignored and the
    fallback just goes to the disk. It was introduced in the last iteration
    of the case-insensitive patch.

    d_compare should return 0 when the entries match, so make sure we are
    correctly comparing the entire string if the encoding feature is set and
    we are on a case-INsensitive directory.

    Fixes: b886ee3e778e ("ext4: Support case-insensitive file name lookups")
    Signed-off-by: Gabriel Krisman Bertazi
    Signed-off-by: Theodore Ts'o

    Gabriel Krisman Bertazi
     
  • Pull SCSI fixes from James Bottomley:
    "This is the same set of patches sent in the merge window as the final
    pull except that Martin's read only rework is replaced with a simple
    revert of the original change that caused the regression.

    Everything else is an obvious fix or small cleanup"

    * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
    Revert "scsi: sd: Keep disk read-only when re-reading partition"
    scsi: bnx2fc: fix incorrect cast to u64 on shift operation
    scsi: smartpqi: Reporting unhandled SCSI errors
    scsi: myrs: Fix uninitialized variable
    scsi: lpfc: Update lpfc version to 12.2.0.2
    scsi: lpfc: add check for loss of ndlp when sending RRQ
    scsi: lpfc: correct rcu unlock issue in lpfc_nvme_info_show
    scsi: lpfc: resolve lockdep warnings
    scsi: qedi: remove set but not used variables 'cdev' and 'udev'
    scsi: qedi: remove memset/memcpy to nfunc and use func instead
    scsi: qla2xxx: Add cleanup for PCI EEH recovery

    Linus Torvalds
     
  • Pull block fixes from Jens Axboe:

    - NVMe pull request from Keith, with fixes from a few folks.

    - bio and sbitmap before atomic barrier fixes (Andrea)

    - Hang fix for blk-mq freeze and unfreeze (Bob)

    - Single segment count regression fix (Christoph)

    - AoE now has a new maintainer

    - tools/io_uring/ Makefile fix, and sync with liburing (me)

    * tag 'for-linus-20190524' of git://git.kernel.dk/linux-block: (23 commits)
    tools/io_uring: sync with liburing
    tools/io_uring: fix Makefile for pthread library link
    blk-mq: fix hang caused by freeze/unfreeze sequence
    block: remove the bi_seg_{front,back}_size fields in struct bio
    block: remove the segment size check in bio_will_gap
    block: force an unlimited segment size on queues with a virt boundary
    block: don't decrement nr_phys_segments for physically contigous segments
    sbitmap: fix improper use of smp_mb__before_atomic()
    bio: fix improper use of smp_mb__before_atomic()
    aoe: list new maintainer for aoe driver
    nvme-pci: use blk-mq mapping for unmanaged irqs
    nvme: update MAINTAINERS
    nvme: copy MTFA field from identify controller
    nvme: fix memory leak for power latency tolerance
    nvme: release namespace SRCU protection before performing controller ioctls
    nvme: merge nvme_ns_ioctl into nvme_ioctl
    nvme: remove the ifdef around nvme_nvm_ioctl
    nvme: fix srcu locking on error return in nvme_get_ns_from_disk
    nvme: Fix known effects
    nvme-pci: Sync queues on reset
    ...

    Linus Torvalds
     
  • …/git/shuah/linux-kselftest

    Pull Kselftest fixes from Shuah Khan:

    - Two fixes to regressions introduced in kselftest Makefile test run
    output refactoring work (Kees Cook)

    - Adding Atom support to syscall_arg_fault test (Tong Bo)

    * tag 'linux-kselftest-5.2-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
    selftests/timers: Add missing fflush(stdout) calls
    selftests: Remove forced unbuffering for test running
    selftests/x86: Support Atom for syscall_arg_fault test

    Linus Torvalds
     
  • Pull Devicetree fixes from Rob Herring:

    - Update checkpatch.pl to use DT vendor-prefixes.yaml

    - Fix DT binding references to files converted to DT schema

    - Clean-up Arm CPU binding examples to match schema

    - Add Sifive block versioning scheme documentation

    - Pass binding directory base to validation tools for reference lookups

    * tag 'devicetree-fixes-for-5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
    checkpatch.pl: Update DT vendor prefix check
    dt: bindings: mtd: replace references to nand.txt with nand-controller.yaml
    dt-bindings: interrupt-controller: arm,gic: Fix schema errors in example
    dt-bindings: arm: Clean up CPU binding examples
    dt: fix refs that were renamed to json with the same file name
    dt-bindings: Pass binding directory to validation tools
    dt-bindings: sifive: describe sifive-blocks versioning

    Linus Torvalds
     
  • Pull more SPDX updates from Greg KH:
    "Here is another set of reviewed patches that adds SPDX tags to
    different kernel files, based on a set of rules that are being used to
    parse the comments to try to determine that the license of the file is
    "GPL-2.0-or-later".

    Only the "obvious" versions of these matches are included here, a
    number of "non-obvious" variants of text have been found but those
    have been postponed for later review and analysis.

    These patches have been out for review on the linux-spdx@vger mailing
    list, and while they were created by automatic tools, they were
    hand-verified by a bunch of different people, all of whose names are on
    the patches as reviewers"

    * tag 'spdx-5.2-rc2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (85 commits)
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 125
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 123
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 122
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 121
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 120
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 119
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 118
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 116
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 114
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 113
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 112
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 111
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 110
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 106
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 105
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 104
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 103
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 102
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 101
    treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 98
    ...

    Linus Torvalds
     
  • The kernel test robot has reported that the use of __this_cpu_add()
    causes bug messages like:

    BUG: using __this_cpu_add() in preemptible [00000000] code: ...

    Given the imprecise nature of the count and the possibility of resetting
    the count and doing the measurement again, it is not really a big
    problem to use the unprotected __this_cpu_*() functions.

    To make the preemption checking code happy, the this_cpu_*() functions
    will be used if CONFIG_DEBUG_PREEMPT is defined.

    The imprecise nature of the locking counts is also documented with
    the suggestion that we should run the measurement a few times with the
    counts reset in between to get a better picture of what is going on
    under the hood.

    Fixes: a8654596f0371 ("locking/rwsem: Enable lock event counting")
    Suggested-by: Linus Torvalds
    Signed-off-by: Waiman Long
    Signed-off-by: Linus Torvalds

    Waiman Long
     
  • Commit 11988499e62b ("KVM: x86: Skip EFER vs. guest CPUID checks for
    host-initiated writes", 2019-04-02) introduced a "return false" in a
    function returning int, and anyway set_efer has a "nonzero on error"
    convention so it should be returning 1.

    Reported-by: Pavel Machek
    Fixes: 11988499e62b ("KVM: x86: Skip EFER vs. guest CPUID checks for host-initiated writes")
    Cc: Sean Christopherson
    Cc: stable@vger.kernel.org
    Signed-off-by: Paolo Bonzini

    Paolo Bonzini
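    Why `return false` is a bug under a "nonzero on error" convention can be
    shown with a small sketch. The function names and the reserved-bit mask
    below are hypothetical and are not taken from the KVM source.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical reserved-bit mask, for illustration only. */
#define EXAMPLE_RESERVED_BITS 0xF0u

/* Buggy shape: `false` converts to 0, which a caller following the
 * "nonzero on error" convention reads as success. */
static int set_value_buggy(uint32_t value)
{
	if (value & EXAMPLE_RESERVED_BITS)
		return false;
	return 0;
}

/* Fixed shape: a rejected write is reported with a nonzero value. */
static int set_value_fixed(uint32_t value)
{
	if (value & EXAMPLE_RESERVED_BITS)
		return 1;
	return 0;
}
```

    With the buggy variant a write that sets reserved bits is indistinguishable
    from an accepted one, because both paths return 0.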
     
  • The fields filter would not work with child fields, as the respective
    parents would not be included. No parents displayed == no children displayed.
    To reproduce, run on s390 (would work on other platforms, too, but would
    require a different filter name):
    - Run 'kvm_stat -d'
    - Press 'f'
    - Enter 'instruct'
    Notice that events like instruction_diag_44 or instruction_diag_500 are not
    displayed - the output remains empty.
    With this patch, we will filter by matching events and their parents.
    However, consider the following example where we filter by
    instruction_diag_44:

    kvm statistics - summary
    regex filter: instruction_diag_44
    Event Total %Total CurAvg/s
    exit_instruction 276 100.0 12
    instruction_diag_44 256 92.8 11
    Total 276 12

    Note that the parent ('exit_instruction') displays the total events, but
    the children listed do not match its total (256 instead of 276). This is
    intended (since we're filtering all but one child), but might be confusing
    at first sight.

    Signed-off-by: Stefan Raspl
    Signed-off-by: Paolo Bonzini

    Stefan Raspl
     
  • struct kvm_nested_state is only available on x86 so far. To be able
    to compile the code on other architectures as well, we need to wrap
    the related code with #ifdefs.

    Signed-off-by: Thomas Huth
    Signed-off-by: Paolo Bonzini

    Thomas Huth
     
  • aarch64 fixups needed to compile with warnings as errors.

    Reviewed-by: Thomas Huth
    Signed-off-by: Andrew Jones
    Signed-off-by: Paolo Bonzini

    Andrew Jones
     
  • VM_MODE_P52V48_4K is not a valid mode for AArch64. Replace its
    use in vm_create_default() with a mode that works and represents
    a good AArch64 default. (We didn't ever see a problem with this
    because we don't have any unit tests using vm_create_default(),
    but it's good to get it fixed in advance.)

    Reported-by: Thomas Huth
    Signed-off-by: Andrew Jones
    Signed-off-by: Paolo Bonzini

    Andrew Jones
     
  • The memory slot size must be aligned to the host's page size. When
    testing a guest with a 4k page size on a host with a 64k page size,
    then 3 guest pages are not host page size aligned. Since we just need
    a nearly arbitrary number of extra pages to ensure the memslot is not
    aligned to a 64 host-page boundary for this test, then we can use
    16, as that's 64k aligned, but not 64 * 64k aligned.

    Fixes: 76d58e0f07ec ("KVM: fix KVM_CLEAR_DIRTY_LOG for memory slots of unaligned size", 2019-04-17)
    Signed-off-by: Andrew Jones
    Signed-off-by: Paolo Bonzini

    Andrew Jones
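    The alignment arithmetic in the message above can be checked with a short
    sketch. The helper names are hypothetical; the page sizes are the 4k guest
    and 64k host values the message uses.

```c
#include <stdbool.h>
#include <stdint.h>

#define GUEST_PAGE_SIZE 4096u          /* 4k guest pages */
#define HOST_PAGE_SIZE  (64u * 1024u)  /* 64k host pages */

/* Size in bytes of `guest_pages` guest pages. */
static uint64_t extra_bytes(uint64_t guest_pages)
{
	return guest_pages * GUEST_PAGE_SIZE;
}

/* True if `size` is a multiple of `boundary`. */
static bool is_aligned(uint64_t size, uint64_t boundary)
{
	return size % boundary == 0;
}
```

    Three guest pages come to 12k, which is not a multiple of 64k. Sixteen
    guest pages come to exactly 64k, which is host-page aligned but still far
    from a multiple of 64 * 64k.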
     
  • kselftests exposed a problem in the s390 handling for memory slots.
    Right now we only do proper memory slot handling for creation of new
    memory slots. Neither MOVE, nor DELETION are handled properly. Let us
    implement those.

    Signed-off-by: Christian Borntraeger
    Signed-off-by: Paolo Bonzini

    Christian Borntraeger
     
  • According to the SDM, for MSR_IA32_PERFCTR0/1 "the lower-order 32 bits of
    each MSR may be written with any value, and the high-order 8 bits are
    sign-extended according to the value of bit 31", but the fixed counters
    in real hardware are limited to the width of the fixed counters ("bits
    beyond the width of the fixed-function counter are reserved and must be
    written as zeros"). Fix KVM to do the same.

    Reported-by: Nadav Amit
    Signed-off-by: Paolo Bonzini

    Paolo Bonzini
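    The two write rules quoted from the SDM above can be sketched as follows.
    This is a reading of the quoted text only, not KVM's implementation; the
    helper names are hypothetical and the counter width is passed in as a
    parameter.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mask covering the low `width` bits of a counter. */
static uint64_t counter_mask(unsigned int width)
{
	return width >= 64 ? ~0ull : (1ull << width) - 1;
}

/* General-purpose counter write: the low 32 bits are taken as written and
 * the higher bits up to `width` are sign-extended from bit 31. */
static uint64_t gp_write_value(uint32_t low32, unsigned int width)
{
	uint64_t v = low32;

	if (low32 & 0x80000000u)
		v |= 0xFFFFFFFF00000000ull;
	return v & counter_mask(width);
}

/* Fixed counter write: bits beyond the counter width are reserved and
 * must be zero, so a value is acceptable only if none of them is set. */
static bool fixed_write_ok(uint64_t value, unsigned int width)
{
	return (value & ~counter_mask(width)) == 0;
}
```

    For a 40-bit counter, writing 0x80000000 to a general-purpose counter
    yields 0xFF80000000, while a fixed-counter value with bit 40 set is
    rejected.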
     
  • This patch will simplify the changes in the next, by enforcing the
    masking of the counters to RDPMC and RDMSR.

    Signed-off-by: Paolo Bonzini

    Paolo Bonzini
     
  • After commit:

    672ff6cff80c ("KVM: x86: Raise #GP when guest vCPU do not support PMU")

    my AMD guests started #GPing like this:

    general protection fault: 0000 [#1] PREEMPT SMP
    CPU: 1 PID: 4355 Comm: bash Not tainted 5.1.0-rc6+ #3
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
    RIP: 0010:x86_perf_event_update+0x3b/0xa0

    with Code: pointing to RDPMC. It is RDPMC because the guest has the
    hardware watchdog CONFIG_HARDLOCKUP_DETECTOR_PERF enabled which uses
    perf. Instrumenting kvm_pmu_rdpmc() some, showed that it fails due to:

    if (!pmu->version)
            return 1;

    which the above commit added. Since AMD's PMU leaves the version at 0,
    that causes the #GP injection into the guest.

    Set pmu->version arbitrarily to 1 and move it above the non-applicable
    struct kvm_pmu members.

    Signed-off-by: Borislav Petkov
    Cc: "H. Peter Anvin"
    Cc: Ingo Molnar
    Cc: Janakarajan Natarajan
    Cc: kvm@vger.kernel.org
    Cc: Liran Alon
    Cc: Mihai Carabas
    Cc: Paolo Bonzini
    Cc: Peter Zijlstra
    Cc: "Radim Krčmář"
    Cc: Thomas Gleixner
    Cc: Tom Lendacky
    Cc: x86@kernel.org
    Cc: stable@vger.kernel.org
    Fixes: 672ff6cff80c ("KVM: x86: Raise #GP when guest vCPU do not support PMU")
    Signed-off-by: Paolo Bonzini

    Borislav Petkov
     
  • Userspace can easily set up invalid processor state in such a way that
    dmesg will be filled with VMCS or VMCB dumps. Disable this by default
    using a module parameter.

    Signed-off-by: Paolo Bonzini

    Paolo Bonzini
     
  • When assigning kvm irqfd we didn't check the irqchip mode but we allow
    KVM_IRQFD to succeed with all the irqchip modes. However it does not
    make much sense to create irqfd even without the kernel chips. Let's
    provide an arch-dependent helper to check whether a specific irqfd is
    allowed by the arch. At least for x86, it should make sense to check:

    - when irqchip mode is NONE, all irqfds should be disallowed, and,

    - when irqchip mode is SPLIT, irqfds that are with resamplefd should
    be disallowed.

    In either case, we previously silently ignored the irq or the irq ack
    event if the irqchip mode was incorrect. However, that can cause
    mysterious guest behavior that is hard to triage. Let's fail KVM_IRQFD
    even earlier to detect these incorrect configurations.

    CC: Paolo Bonzini
    CC: Radim Krčmář
    CC: Alex Williamson
    CC: Eduardo Habkost
    Signed-off-by: Peter Xu
    Cc: stable@vger.kernel.org
    Signed-off-by: Paolo Bonzini

    Peter Xu
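    The two x86 rules listed in the message above amount to a small decision
    function. The sketch below uses hypothetical type and function names; it
    illustrates the rules as stated and is not the helper the patch adds.

```c
#include <stdbool.h>

/* Hypothetical irqchip modes mirroring the ones named in the message. */
enum example_irqchip_mode {
	EXAMPLE_IRQCHIP_NONE,
	EXAMPLE_IRQCHIP_SPLIT,
	EXAMPLE_IRQCHIP_KERNEL,
};

/* Hypothetical request descriptor: only the resamplefd flag matters here. */
struct example_irqfd_args {
	bool has_resamplefd;
};

/* Return true if an irqfd with these arguments should be accepted. */
static bool example_irqfd_allowed(enum example_irqchip_mode mode,
				  const struct example_irqfd_args *args)
{
	switch (mode) {
	case EXAMPLE_IRQCHIP_NONE:
		return false;                 /* no in-kernel chip at all */
	case EXAMPLE_IRQCHIP_SPLIT:
		return !args->has_resamplefd; /* resamplefd not allowed */
	default:
		return true;
	}
}
```

    Rejecting the request up front turns a silent misconfiguration into an
    immediate error that userspace can see.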
     
  • Current logic does not allow VCPU to be loaded onto CPU with
    APIC ID 255. This should be allowed since the host physical APIC ID
    field in the AVIC Physical APIC table entry is an 8-bit value,
    and APIC ID 255 is valid in a system with x2APIC enabled.
    Instead, do not allow VCPU load if the host APIC ID cannot be
    represented by an 8-bit value.

    Also, use the more appropriate AVIC_PHYSICAL_ID_ENTRY_HOST_PHYSICAL_ID_MASK
    instead of AVIC_MAX_PHYSICAL_ID_COUNT.

    Signed-off-by: Suravee Suthikulpanit
    Cc: stable@vger.kernel.org
    Signed-off-by: Paolo Bonzini

    Suthikulpanit, Suravee
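    The off-by-one can be illustrated with two range checks. The constants
    below are assumptions made for the sketch (a count of 255 for the old
    check and an 8-bit mask of 0xFF for the new one), and the function names
    are hypothetical.

```c
#include <stdbool.h>

/* Assumed values, for illustration only. */
#define EXAMPLE_MAX_ID_COUNT 255u   /* old limit, used with >= */
#define EXAMPLE_HOST_ID_MASK 0xFFu  /* 8-bit host physical APIC ID field */

/* Old shape: rejects ID 255 even though it fits in 8 bits. */
static bool host_id_rejected_old(unsigned int id)
{
	return id >= EXAMPLE_MAX_ID_COUNT;
}

/* New shape: rejects only IDs that cannot be represented in 8 bits. */
static bool host_id_rejected_new(unsigned int id)
{
	return id > EXAMPLE_HOST_ID_MASK;
}
```

    Comparing against the field mask ties the check to what the table entry
    can actually hold, rather than to a count.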
     
  • The guest_code of sync_regs_test is assuming that the compiler will not
    touch %r11 outside the asm that increments it, which is a bit brittle.
    Instead, we can increment a variable and use a dummy asm to ensure the
    increment is not optimized away. However, we also need to use a
    callee-save register or the compiler will insert a save/restore around
    the vmexit, breaking the whole idea behind the test.

    (Yes, "if it ain't broken...", but I would like the test to be clean
    before it is copied into the upcoming s390 selftests).

    Signed-off-by: Paolo Bonzini

    Paolo Bonzini
     
  • The check for entry->index == 0 is done twice. One time should
    be sufficient.

    Suggested-by: Vitaly Kuznetsov
    Signed-off-by: Thomas Huth
    Signed-off-by: Paolo Bonzini

    Thomas Huth
     
  • Expose per-vCPU timer_advance_ns to userspace, so it is able to
    query the auto-adjusted value.

    Cc: Paolo Bonzini
    Cc: Radim Krčmář
    Cc: Sean Christopherson
    Cc: Liran Alon
    Signed-off-by: Wanpeng Li
    Signed-off-by: Paolo Bonzini

    Wanpeng Li
     
  • After commit c3941d9e0 (KVM: lapic: Allow user to disable adaptive tuning of
    timer advancement), '-1' enables adaptive tuning starting from the default
    advancement of 1000ns. However, we should expose an int module parameter
    instead of a uint that overflows.

    Before patch:

    /sys/module/kvm/parameters/lapic_timer_advance_ns:4294967295

    After patch:

    /sys/module/kvm/parameters/lapic_timer_advance_ns:-1

    Fixes: c3941d9e0 (KVM: lapic: Allow user to disable adaptive tuning of timer advancement)
    Cc: Paolo Bonzini
    Cc: Radim Krčmář
    Cc: Sean Christopherson
    Cc: Liran Alon
    Reviewed-by: Sean Christopherson
    Signed-off-by: Wanpeng Li
    Signed-off-by: Paolo Bonzini

    Wanpeng Li
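    The before and after values shown above follow from how the same bit
    pattern prints as unsigned versus signed. A minimal sketch, assuming a
    32-bit `unsigned int`; the helper names are hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/* Format a parameter value the way an unsigned parameter would print. */
static int format_as_uint(char *buf, size_t len, unsigned int value)
{
	return snprintf(buf, len, "%u", value);
}

/* Format a parameter value the way a signed parameter would print. */
static int format_as_int(char *buf, size_t len, int value)
{
	return snprintf(buf, len, "%d", value);
}
```

    Storing -1 in an unsigned 32-bit variable wraps to 4294967295, which is
    what the unsigned formatter prints; the signed formatter prints -1.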
     
  • We get a warning when building the kernel with W=1:
    arch/x86/kvm/vmx/vmx.c:6365:6: warning: no previous prototype for ‘vmx_update_host_rsp’ [-Wmissing-prototypes]
    void vmx_update_host_rsp(struct vcpu_vmx *vmx, unsigned long host_rsp)

    Add the missing declaration to fix this.

    Signed-off-by: Yi Wang
    Signed-off-by: Paolo Bonzini

    Yi Wang
     
  • BUG: using __this_cpu_read() in preemptible [00000000] code: qemu-system-x86/4590
    caller is nested_vmx_enter_non_root_mode+0xebd/0x1790 [kvm_intel]
    CPU: 4 PID: 4590 Comm: qemu-system-x86 Tainted: G OE 5.1.0-rc4+ #1
    Call Trace:
    dump_stack+0x67/0x95
    __this_cpu_preempt_check+0xd2/0xe0
    nested_vmx_enter_non_root_mode+0xebd/0x1790 [kvm_intel]
    nested_vmx_run+0xda/0x2b0 [kvm_intel]
    handle_vmlaunch+0x13/0x20 [kvm_intel]
    vmx_handle_exit+0xbd/0x660 [kvm_intel]
    kvm_arch_vcpu_ioctl_run+0xa2c/0x1e50 [kvm]
    kvm_vcpu_ioctl+0x3ad/0x6d0 [kvm]
    do_vfs_ioctl+0xa5/0x6e0
    ksys_ioctl+0x6d/0x80
    __x64_sys_ioctl+0x1a/0x20
    do_syscall_64+0x6f/0x6c0
    entry_SYSCALL_64_after_hwframe+0x49/0xbe

    Accessing a per-cpu variable requires preemption to be disabled. This
    patch extends the preemption-disabled region to cover __this_cpu_read().

    Cc: Paolo Bonzini
    Cc: Radim Krčmář
    Signed-off-by: Wanpeng Li
    Fixes: 52017608da33 ("KVM: nVMX: add option to perform early consistency checks via H/W")
    Cc: stable@vger.kernel.org
    Reviewed-by: Sean Christopherson
    Signed-off-by: Paolo Bonzini

    Wanpeng Li
     
  • s390 does not have memremap, even though in this particular case it
    would be useful.

    Signed-off-by: Paolo Bonzini

    Paolo Bonzini
     
  • KVM now supports extended CPUID functions through 0x8000001f. CPUID
    leaf 0x8000001e is AMD's Processor Topology Information leaf. This
    contains similar information to CPUID leaf 0xb (Intel's Extended
    Topology Enumeration leaf), and should be included in the output of
    KVM_GET_SUPPORTED_CPUID, even though userspace is likely to override
    some of this information based upon the configuration of the
    particular VM.

    Cc: Brijesh Singh
    Cc: Borislav Petkov
    Fixes: 8765d75329a38 ("KVM: X86: Extend CPUID range to include new leaf")
    Signed-off-by: Jim Mattson
    Reviewed-by: Marc Orr
    Reviewed-by: Borislav Petkov
    Signed-off-by: Paolo Bonzini

    Jim Mattson
     
  • Per the APM, "CPUID Fn8000_001D_E[D,C,B,A]X reports cache topology
    information for the cache enumerated by the value passed to the
    instruction in ECX, referred to as Cache n in the following
    description. To gather information for all cache levels, software must
    repeatedly execute CPUID with 8000_001Dh in EAX and ECX set to
    increasing values beginning with 0 until a value of 00h is returned in
    the field CacheType (EAX[4:0]) indicating no more cache descriptions
    are available for this processor."

    The termination condition is the same as leaf 4, so we can reuse that
    code block for leaf 0x8000001d.

    Fixes: 8765d75329a38 ("KVM: X86: Extend CPUID range to include new leaf")
    Cc: Brijesh Singh
    Cc: Borislav Petkov
    Signed-off-by: Jim Mattson
    Reviewed-by: Marc Orr
    Reviewed-by: Borislav Petkov
    Signed-off-by: Paolo Bonzini

    Jim Mattson
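    The enumeration loop described in the APM quotation above can be sketched
    without executing CPUID, by walking a table of EAX values as they would
    be returned for ECX = 0, 1, 2, and so on. The function names and the
    table-driven setup are hypothetical; only the CacheType field position
    (EAX[4:0]) and the stop condition come from the quoted text.

```c
#include <stddef.h>
#include <stdint.h>

/* CacheType lives in EAX[4:0]; a value of 0 means no more caches. */
static unsigned int cache_type(uint32_t eax)
{
	return eax & 0x1Fu;
}

/* Count the cache subleaves: walk increasing indices and stop at the
 * first entry whose CacheType is 0, or after `max` entries. */
static size_t count_cache_subleaves(const uint32_t *eax_by_index, size_t max)
{
	size_t n = 0;

	while (n < max && cache_type(eax_by_index[n]) != 0)
		n++;
	return n;
}
```

    Because the stop condition depends only on the CacheType field, the same
    loop shape serves any leaf that terminates this way.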
     
  • So far the KVM selftests are compiled without any compiler warnings
    enabled. That's quite bad, since we miss a lot of possible bugs this
    way. Let's enable at least "-Wall" and some other useful warning flags
    now, and fix at least the trivial problems in the code (like unused
    variables).

    Signed-off-by: Thomas Huth
    Signed-off-by: Paolo Bonzini

    Thomas Huth