27 Dec, 2016

2 commits


26 Dec, 2016

3 commits

  • Pull timer type cleanups from Thomas Gleixner:
    "This series does a tree wide cleanup of types related to
    timers/timekeeping.

    - Get rid of cycle_t and use a plain u64. The type is not really
    helpful and caused more confusion than clarity

    - Get rid of the ktime union. The union has become useless as we use
    the scalar nanoseconds storage unconditionally now. The 32bit
    timespec alike storage got removed due to the Y2038 limitations
    some time ago.

    That leaves the odd union access around for no reason. Clean it up.

    Both changes have been done with coccinelle and a small amount of
    manual mopping up"

    * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    ktime: Get rid of ktime_equal()
    ktime: Cleanup ktime_set() usage
    ktime: Get rid of the union
    clocksource: Use a plain u64 instead of cycle_t

    Linus Torvalds
     
  • Pull SMP hotplug notifier removal from Thomas Gleixner:
    "This is the final cleanup of the hotplug notifier infrastructure. The
    series has been reintegrated in the last two days because there came a
    new driver using the old infrastructure via the SCSI tree.

    Summary:

    - convert the last leftover drivers utilizing notifiers

    - fixup for a completely broken hotplug user

    - prevent setup of already used states

    - removal of the notifiers

    - treewide cleanup of hotplug state names

    - consolidation of state space

    There is a sphinx based documentation pending, but that needs review
    from the documentation folks"

    * 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    irqchip/armada-xp: Consolidate hotplug state space
    irqchip/gic: Consolidate hotplug state space
    coresight/etm3/4x: Consolidate hotplug state space
    cpu/hotplug: Cleanup state names
    cpu/hotplug: Remove obsolete cpu hotplug register/unregister functions
    staging/lustre/libcfs: Convert to hotplug state machine
    scsi/bnx2i: Convert to hotplug state machine
    scsi/bnx2fc: Convert to hotplug state machine
    cpu/hotplug: Prevent overwriting of callbacks
    x86/msr: Remove bogus cleanup from the error path
    bus: arm-ccn: Prevent hotplug callback leak
    perf/x86/intel/cstate: Prevent hotplug callback leak
    ARM/imx/mmcd: Fix broken cpu hotplug handling
    scsi: qedi: Convert to hotplug state machine

    Linus Torvalds
     
  • ktime_set(S,N) was required for the timespec storage type and is still
    useful for situations where a Seconds and Nanoseconds part of a time value
    needs to be converted. For anything where the Seconds argument is 0, this
    is pointless and can be replaced with a simple assignment.

    Signed-off-by: Thomas Gleixner
    Cc: Peter Zijlstra

    Thomas Gleixner
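
The ktime_set() cleanup described above can be illustrated with a small user-space sketch. It assumes that a ktime value is a plain signed 64-bit nanosecond count, as the union removal in this series implies. The names `ktime_sketch_t` and `ktime_set_sketch()` are hypothetical stand-ins, not the kernel's definitions, and the sketch leaves out any saturation handling.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000LL

/* Assumption: a ktime value is just signed 64-bit nanoseconds. */
typedef int64_t ktime_sketch_t;

/* Hypothetical stand-in for ktime_set(): combine seconds and nanoseconds. */
static ktime_sketch_t ktime_set_sketch(int64_t secs, int64_t nsecs)
{
    /* Sketch only: reject inputs that would overflow instead of saturating. */
    assert(secs > INT64_MIN / NSEC_PER_SEC && secs < INT64_MAX / NSEC_PER_SEC);
    assert(nsecs >= 0 && nsecs < NSEC_PER_SEC);

    return secs * NSEC_PER_SEC + nsecs;
}
```

With a zero seconds argument the helper returns its nanoseconds argument unchanged, which is why such call sites can become a simple assignment.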
     

25 Dec, 2016

5 commits

  • There is no point in having an extra type for extra confusion. u64 is
    unambiguous.

    Conversion was done with the following coccinelle script:

    @rem@
    @@
    -typedef u64 cycle_t;

    @fix@
    typedef cycle_t;
    @@
    -cycle_t
    +u64

    Signed-off-by: Thomas Gleixner
    Cc: Peter Zijlstra
    Cc: John Stultz

    Thomas Gleixner
     
  • When the state names got added a script was used to add the extra argument
    to the calls. The script basically converted the state constant to a
    string, but the cleanup to convert these strings into meaningful ones did
    not happen.

    Replace all the useless strings with 'subsys/xxx/yyy:state' strings which
    are used in all the other places already.

    Signed-off-by: Thomas Gleixner
    Cc: Peter Zijlstra
    Cc: Sebastian Siewior
    Link: http://lkml.kernel.org/r/20161221192112.085444152@linutronix.de
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     
  • The error cleanup which is invoked when the hotplug state setup failed
    tries to remove the failed state, which is broken.

    Fixes: 8fba38c937cd ("x86/msr: Convert to hotplug state machine")
    Reported-by: kernel test robot
    Signed-off-by: Thomas Gleixner
    Cc: Sebastian Siewior

    Thomas Gleixner
     
  • If the pmu registration fails the registered hotplug callbacks are not
    removed. Wrong in any case, but fatal in case of a modular driver.

    Replace the nonsensical state names with proper ones while at it.

    Fixes: 77c34ef1c319 ("perf/x86/intel/cstate: Convert Intel CSTATE to hotplug state machine")
    Signed-off-by: Thomas Gleixner
    Cc: Sebastian Siewior
    Cc: Peter Zijlstra
    Cc: stable@vger.kernel.org

    Thomas Gleixner
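
The callback leak described above follows a common error-path pattern, sketched here in user space. Every name below (`setup_state_mock()`, `remove_state_mock()`, `driver_init_sketch()`) is hypothetical and only mimics the shape of the problem; it is not the cstate driver's code.

```c
#include <stdbool.h>

/* Mock bookkeeping: how many hotplug callbacks are currently installed. */
static int callbacks_installed;

/* Hypothetical stand-ins for installing and removing a hotplug state. */
static int setup_state_mock(void)
{
    callbacks_installed++;
    return 0;
}

static void remove_state_mock(void)
{
    callbacks_installed--;
}

/* Init path: a later failure must undo the earlier callback setup. */
static int driver_init_sketch(bool pmu_register_fails)
{
    int err = setup_state_mock();

    if (err)
        return err;

    if (pmu_register_fails) {
        /* Without this call the callbacks would outlive a failed init. */
        remove_state_mock();
        return -1;
    }
    return 0;
}
```

For a modular driver the missing cleanup matters most, because the leaked callbacks would point into code that is gone once the module load fails.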
     
  • This was entirely automated, using the script by Al:

    PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>'
    sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \
    $(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h)

    to do the replacement at the end of the merge window.

    Requested-by: Al Viro
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

24 Dec, 2016

3 commits

  • Pull x86 fixes from Ingo Molnar:
    "There's a number of fixes:

    - a round of fixes for CPUID-less legacy CPUs
    - a number of microcode loader fixes
    - i8042 detection robustization fixes
    - stack dump/unwinder fixes
    - x86 SoC platform driver fixes
    - a GCC 7 warning fix
    - virtualization related fixes"

    * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
    Revert "x86/unwind: Detect bad stack return address"
    x86/paravirt: Mark unused patch_default label
    x86/microcode/AMD: Reload proper initrd start address
    x86/platform/intel/quark: Add printf attribute to imr_self_test_result()
    x86/platform/intel-mid: Switch MPU3050 driver to IIO
    x86/alternatives: Do not use sync_core() to serialize I$
    x86/topology: Document cpu_llc_id
    x86/hyperv: Handle unknown NMIs on one CPU when unknown_nmi_panic
    x86/asm: Rewrite sync_core() to use IRET-to-self
    x86/microcode/intel: Replace sync_core() with native_cpuid()
    Revert "x86/boot: Fail the boot if !M486 and CPUID is missing"
    x86/asm/32: Make sync_core() handle missing CPUID on all 32-bit kernels
    x86/cpu: Probe CPUID leaf 6 even when cpuid_level == 6
    x86/tools: Fix gcc-7 warning in relocs.c
    x86/unwind: Dump stack data on warnings
    x86/unwind: Adjust last frame check for aligned function stacks
    x86/init: Fix a couple of comment typos
    x86/init: Remove i8042_detect() from platform ops
    Input: i8042 - Trust firmware a bit more when probing on X86
    x86/init: Add i8042 state to the platform data
    ...

    Linus Torvalds
     
  • Pull perf fixes from Ingo Molnar:
    "On the kernel side there's two x86 PMU driver fixes and a uprobes fix,
    plus on the tooling side there's a number of fixes and some late
    updates"

    * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits)
    perf sched timehist: Fix invalid period calculation
    perf sched timehist: Remove hardcoded 'comm_width' check at print_summary
    perf sched timehist: Enlarge default 'comm_width'
    perf sched timehist: Honour 'comm_width' when aligning the headers
    perf/x86: Fix overlap counter scheduling bug
    perf/x86/pebs: Fix handling of PEBS buffer overflows
    samples/bpf: Move open_raw_sock to separate header
    samples/bpf: Remove perf_event_open() declaration
    samples/bpf: Be consistent with bpf_load_program bpf_insn parameter
    tools lib bpf: Add bpf_prog_{attach,detach}
    samples/bpf: Switch over to libbpf
    perf diff: Do not overwrite valid build id
    perf annotate: Don't throw error for zero length symbols
    perf bench futex: Fix lock-pi help string
    perf trace: Check if MAP_32BIT is defined (again)
    samples/bpf: Make perf_event_read() static
    uprobes: Fix uprobes on MIPS, allow for a cache flush after ixol breakpoint creation
    samples/bpf: Make samples more libbpf-centric
    tools lib bpf: Add flags to bpf_create_map()
    tools lib bpf: use __u32 from linux/types.h
    ...

    Linus Torvalds
     
  • Revert the following commit:

    b6959a362177 ("x86/unwind: Detect bad stack return address")

    ... because Andrey Konovalov reported an unwinder warning:

    WARNING: unrecognized kernel stack return address ffffffffa0000001 at ffff88006377fa18 in a.out:4467

    The unwind was initiated from an interrupt which occurred while running in the
    generated code for a kprobe. The unwinder printed the warning because it
    expected regs->ip to point to a valid text address, but instead it pointed to
    the generated code.

    Eventually we may want to come up with a way to identify generated kprobe
    code so the unwinder can know that it's a valid return address. Until
    then, just remove the warning.

    Reported-by: Andrey Konovalov
    Signed-off-by: Josh Poimboeuf
    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Denys Vlasenko
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Masami Hiramatsu
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/02f296848fbf49fb72dfeea706413ecbd9d4caf6.1482418739.git.jpoimboe@redhat.com
    Signed-off-by: Ingo Molnar

    Josh Poimboeuf
     

23 Dec, 2016

4 commits

  • Pull x86 cache allocation interface from Thomas Gleixner:
    "This provides support for Intel's Cache Allocation Technology, a cache
    partitioning mechanism.

    The interface is odd, but the hardware interface of that CAT stuff is
    odd as well.

    We tried hard to come up with an abstraction, but that only allows
    rather simple partitioning, but no way of sharing and dealing with the
    per package nature of this mechanism.

    In the end we decided to expose the allocation bitmaps directly so all
    combinations of the hardware can be utilized.

    There are two ways of associating a cache partition:

    - Task

    A task can be added to a resource group. It uses the cache
    partition associated to the group.

    - CPU

    All tasks which are not members of a resource group use the group to
    which the CPU they are running on is associated.

    That allows for simple CPU based partitioning schemes.

    The main expected users are:

    - Virtualization so a VM can only trash the associated part of
    the cache w/o disturbing others

    - Real-Time systems to separate RT and general workloads.

    - Latency sensitive enterprise workloads

    - In theory this also can be used to protect against cache side
    channel attacks"

    [ Intel RDT is "Resource Director Technology". The interface really is
    rather odd and very specific, which delayed this pull request while I
    was thinking about it. The pull request itself came in early during
    the merge window, I just delayed it until things had calmed down and I
    had more time.

    But people tell me they'll use this, and the good news is that it is
    _so_ specific that it's rather independent of anything else, and no
    user is going to depend on the interface since it's pretty rare. So if
    push comes to shove, we can just remove the interface and nothing will
    break ]

    * 'x86-cache-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (31 commits)
    x86/intel_rdt: Implement show_options() for resctrlfs
    x86/intel_rdt: Call intel_rdt_sched_in() with preemption disabled
    x86/intel_rdt: Update task closid immediately on CPU in rmdir and unmount
    x86/intel_rdt: Fix setting of closid when adding CPUs to a group
    x86/intel_rdt: Update percpu closid immeditately on CPUs affected by changee
    x86/intel_rdt: Reset per cpu closids on unmount
    x86/intel_rdt: Select KERNFS when enabling INTEL_RDT_A
    x86/intel_rdt: Prevent deadlock against hotplug lock
    x86/intel_rdt: Protect info directory from removal
    x86/intel_rdt: Add info files to Documentation
    x86/intel_rdt: Export the minimum number of set mask bits in sysfs
    x86/intel_rdt: Propagate error in rdt_mount() properly
    x86/intel_rdt: Add a missing #include
    MAINTAINERS: Add maintainer for Intel RDT resource allocation
    x86/intel_rdt: Add scheduler hook
    x86/intel_rdt: Add schemata file
    x86/intel_rdt: Add tasks files
    x86/intel_rdt: Add cpus file
    x86/intel_rdt: Add mkdir to resctrl file system
    x86/intel_rdt: Add "info" files to resctrl file system
    ...

    Linus Torvalds
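
The two association rules and the bitmap sharing mentioned above can be sketched as follows. This is a minimal illustration, assuming that group number 0 means "task is not a member of any resource group". The function names are hypothetical and do not come from the resctrl code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Pick the resource group whose cache partition applies.
 * Assumption: group 0 means "no task-specific group", so the
 * group associated with the current CPU is used instead.
 */
static int effective_group(int task_group, int cpu_group)
{
    assert(task_group >= 0 && cpu_group >= 0);
    return task_group != 0 ? task_group : cpu_group;
}

/* Two groups share cache when their allocation bitmaps overlap. */
static bool groups_share_cache(uint32_t bitmap_a, uint32_t bitmap_b)
{
    return (bitmap_a & bitmap_b) != 0;
}
```

Exposing the raw bitmaps is what makes both disjoint partitions (0xf0 and 0x0f) and partially shared ones (0xf0 and 0x3c) expressible.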
     
  • Jiri reported the overlap scheduling exceeding its max stack.

    Looking at the constraint that triggered this, it turns out the
    overlap marker isn't needed.

    The comment with EVENT_CONSTRAINT_OVERLAP states: "This is the case if
    the counter mask of such an event is not a subset of any other counter
    mask of a constraint with an equal or higher weight".

    Esp. that latter part is of interest here I think, our overlapping mask
    is 0x0e, that has 3 bits set and is the highest weight mask on the
    PMU, therefore it will be placed last. Can we still create a scenario
    where we would need to rewind that?

    The scenario for AMD Fam15h is we're having masks like:

    0x3F -- 111111
    0x38 -- 111000
    0x07 -- 000111

    0x09 -- 001001

    And we mark 0x09 as overlapping, because it is not a direct subset of
    0x38 or 0x07 and has less weight than either of those. This means we'll
    first try and place the 0x09 event, then try and place 0x38/0x07 events.
    Now imagine we have:

    3 * 0x07 + 0x09

    and the initial pick for the 0x09 event is counter 0, then we'll fail to
    place all 0x07 events. So we'll pop back, try counter 4 for the 0x09
    event, and then re-try all 0x07 events, which will now work.

    The masks on the PMU in question are:

    0x01 - 0001
    0x03 - 0011
    0x0e - 1110
    0x0c - 1100

    But since all the masks that have overlap (0xe -> {0xc,0x3}) and (0x3 ->
    0x1) are of heavier weight, it should all work out.

    Reported-by: Jiri Olsa
    Tested-by: Jiri Olsa
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Alexander Shishkin
    Cc: Arnaldo Carvalho de Melo
    Cc: Jiri Olsa
    Cc: Liang Kan
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Robert Richter
    Cc: Stephane Eranian
    Cc: Thomas Gleixner
    Cc: Vince Weaver
    Link: http://lkml.kernel.org/r/20161109155153.GQ3142@twins.programming.kicks-ass.net
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
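
The AMD Fam15h scenario in the message above (3 * 0x07 + 0x09) can be reproduced with a small sketch. It compares a greedy placement that never revisits a choice with a backtracking placement. This is an illustration of the idea only, with hypothetical function names; it is not the perf scheduler's algorithm.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Weight of a constraint: the number of counters its mask allows. */
static int mask_weight(uint64_t mask)
{
    int weight = 0;

    for (; mask; mask &= mask - 1)
        weight++;
    return weight;
}

/* Greedy: give each event its lowest free counter, never reconsider. */
static bool place_greedy(const uint64_t *masks, size_t n)
{
    uint64_t used = 0;

    assert(masks != NULL);
    for (size_t i = 0; i < n; i++) {
        uint64_t free_allowed = masks[i] & ~used;

        if (!free_allowed)
            return false;
        used |= free_allowed & -free_allowed; /* lowest set bit */
    }
    return true;
}

/* Backtracking: on failure, pop back and try the next allowed counter. */
static bool place_backtrack(const uint64_t *masks, size_t n, size_t i,
                            uint64_t used)
{
    uint64_t candidates;

    if (i == n)
        return true;

    candidates = masks[i] & ~used;
    while (candidates) {
        uint64_t bit = candidates & -candidates;

        if (place_backtrack(masks, n, i + 1, used | bit))
            return true;
        candidates &= ~bit;
    }
    return false;
}
```

Placing the events in the order 0x09, 0x07, 0x07, 0x07 fails greedily because 0x09 takes the lowest counter first, while the backtracking variant moves 0x09 to its other allowed counter and succeeds.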
     
  • This patch solves a race condition between PEBS and the PMU handler.

    In case multiple PEBS events are sampled at the same time,
    it is possible to have GLOBAL_STATUS bit 62 set indicating
    PEBS buffer overflow and also seeing at most 3 PEBS counters
    having their bits set in the status register. This is a sign
    that there was at least one PEBS record pending at the time
    of the PMU interrupt. PEBS counters must only be processed
    via the drain_pebs() calls, and not via the regular sample
    processing loop that comes after it in the function, otherwise
    phony regular samples may be generated in the sampling buffer
    not marked with the EXACT tag.

    Another possibility is to have one PEBS event and at least
    one non-PEBS event which overflows while PEBS is armed. In this
    case, bit 62 of GLOBAL_STATUS will not be set, yet the overflow
    status bit for the PEBS counter will be on Skylake.

    To avoid this problem, we systematically ignore the PEBS-enabled
    counters from the GLOBAL_STATUS mask and we always process PEBS
    events via drain_pebs().

    The problem manifested itself by having non-exact samples when
    sampling only PEBS events, i.e., the PERF_SAMPLE_RECORD would
    not have the EXACT flag set.

    Note that this problem is only present on Skylake processors.
    This fix is harmless on older processors.

    Reported-by: Peter Zijlstra
    Signed-off-by: Stephane Eranian
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Alexander Shishkin
    Cc: Arnaldo Carvalho de Melo
    Cc: Jiri Olsa
    Cc: Linus Torvalds
    Cc: Thomas Gleixner
    Cc: Vince Weaver
    Link: http://lkml.kernel.org/r/1482395366-8992-1-git-send-email-eranian@google.com
    Signed-off-by: Ingo Molnar

    Stephane Eranian
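
The masking step described above can be sketched as a pure function. It assumes the PEBS-enabled counters are available as a bitmask with the same bit layout as the counter bits in GLOBAL_STATUS. The function name is hypothetical, and the real handler does more than this.

```c
#include <assert.h>
#include <stdint.h>

/* Bit 62 of GLOBAL_STATUS signals a PEBS buffer overflow. */
#define GLOBAL_STATUS_PEBS_OVF (1ULL << 62)

/*
 * Return the overflow bits that the regular sample loop may handle.
 * PEBS-enabled counters are always left to the PEBS drain path, whether
 * or not bit 62 is set.
 */
static uint64_t regular_overflow_bits(uint64_t global_status,
                                      uint64_t pebs_enabled)
{
    /* Sketch assumption: pebs_enabled only carries counter bits. */
    assert((pebs_enabled & GLOBAL_STATUS_PEBS_OVF) == 0);

    return global_status & ~pebs_enabled & ~GLOBAL_STATUS_PEBS_OVF;
}
```

Both cases from the message are covered: with bit 62 set and PEBS counters flagged, and with bit 62 clear but a PEBS counter bit still set, only the non-PEBS counters remain for regular processing.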
     
  • A bugfix commit:

    45dbea5f55c0 ("x86/paravirt: Fix native_patch()")

    ... introduced a harmless warning:

    arch/x86/kernel/paravirt_patch_32.c: In function 'native_patch':
    arch/x86/kernel/paravirt_patch_32.c:71:1: error: label 'patch_default' defined but not used [-Werror=unused-label]

    Fix it by annotating the label as __maybe_unused.

    Reported-by: Arnd Bergmann
    Reported-by: Piotr Gregor
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Denys Vlasenko
    Cc: H. Peter Anvin
    Cc: Josh Poimboeuf
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Fixes: 45dbea5f55c0 ("x86/paravirt: Fix native_patch()")
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

21 Dec, 2016

1 commit

  • When we switch to virtual addresses and, especially after
    reserve_initrd()->relocate_initrd() have run, we have the updated initrd
    address in initrd_start. Use initrd_start then instead of the address
    which has been passed to us through boot params. (That still gets used
    when we're running the very early routines on the BSP).

    Reported-and-tested-by: Boris Ostrovsky
    Signed-off-by: Borislav Petkov
    Link: http://lkml.kernel.org/r/20161220144012.lc4cwrg6dphqbyqu@pd.tnic
    Signed-off-by: Thomas Gleixner

    Borislav Petkov
     

20 Dec, 2016

5 commits

  • __printf() attributes help detecting issues in printf() format strings at
    compile time.

    Even though imr_selftest.c is only compiled with
    CONFIG_DEBUG_IMR_SELFTEST=y, GCC complains about a missing format
    attribute when compiling allmodconfig with -Wmissing-format-attribute.

    Silence this warning by adding the attribute.

    Signed-off-by: Nicolas Iooss
    Acked-by: Bryan O'Donoghue
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/20161219132144.4108-1-nicolas.iooss_linux@m4x.org
    Signed-off-by: Ingo Molnar

    Nicolas Iooss
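
What the attribute buys can be shown with a user-space sketch. The kernel's `__printf(a, b)` is a wrapper around the GCC/Clang `format(printf, a, b)` attribute; the macro name `PRINTF_LIKE` and the function `report_result()` below are hypothetical and only resemble a self-test reporter.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Same compiler attribute that the kernel's __printf() macro wraps. */
#define PRINTF_LIKE(fmt_idx, args_idx) \
    __attribute__((format(printf, fmt_idx, args_idx)))

/* Argument 4 is the format string, the variadic arguments start at 5. */
static int report_result(char *buf, size_t len, int ok,
                         const char *fmt, ...) PRINTF_LIKE(4, 5);

static int report_result(char *buf, size_t len, int ok,
                         const char *fmt, ...)
{
    va_list ap;
    int used;

    assert(buf != NULL && len > 0);

    used = snprintf(buf, len, "%s: ", ok ? "pass" : "fail");
    if (used < 0 || (size_t)used >= len)
        return used;

    va_start(ap, fmt);
    used += vsnprintf(buf + used, len - (size_t)used, fmt, ap);
    va_end(ap);
    return used;
}
```

With the attribute in place, a call such as `report_result(buf, sizeof(buf), 1, "region %d", "three")` draws a compile-time format warning instead of misbehaving at run time.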
     
  • The Intel Mid goes in and creates a I2C device for the
    MPU3050 if the input driver for MPU-3050 is activated.

    As of commit:

    3904b28efb2c ("iio: gyro: Add driver for the MPU-3050 gyroscope")

    .. there is a proper and fully featured IIO driver for this
    device, so deprecate the use of the incomplete input driver
    by augmenting the device population code to react to the
    presence of the IIO driver's Kconfig symbol instead.

    Signed-off-by: Linus Walleij
    Acked-by: Andy Shevchenko
    Cc: Dmitry Torokhov
    Cc: Jonathan Cameron
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1481722794-4348-1-git-send-email-linus.walleij@linaro.org
    Signed-off-by: Ingo Molnar

    Linus Walleij
     
  • We use sync_core() in the alternatives code to stop speculative
    execution of prefetched instructions because we are potentially changing
    them and don't want to execute stale bytes.

    What it does on most machines is call CPUID which is a serializing
    instruction. And that's expensive.

    However, the instruction cache is serialized when we're on the local CPU
    and are changing the data through the same virtual address. So then, we
    don't need the serializing CPUID but a simple control flow change. The
    latter is accomplished with a CALL/RET which the noinline causes.

    Suggested-by: Linus Torvalds
    Signed-off-by: Borislav Petkov
    Reviewed-by: Andy Lutomirski
    Cc: Andrew Cooper
    Cc: Andy Lutomirski
    Cc: Brian Gerst
    Cc: Henrique de Moraes Holschuh
    Cc: Matthew Whitehead
    Cc: One Thousand Gnomes
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/20161203150258.vwr5zzco7ctgc4pe@pd.tnic
    Signed-off-by: Ingo Molnar

    Borislav Petkov
     
  • There is a feature in Hyper-V ('Debug-VM --InjectNonMaskableInterrupt')
    which injects NMI to the guest. We may want to crash the guest and do kdump
    on this NMI by enabling unknown_nmi_panic. To make kdump succeed we need to
    allow the kdump kernel to re-establish VMBus connection so it will see
    VMBus devices (storage, network,..).

    To properly unload VMBus making it possible to start over during kdump we
    need to do the following:

    - Send an 'unload' message to the hypervisor. This can be done on any CPU
    so we do this on the crashing CPU.

    - Receive the 'unload finished' reply message. WS2012R2 delivers this
    message to the CPU which was used to establish VMBus connection during
    module load and this CPU may differ from the CPU sending 'unload'.

    Receiving a VMBus message means the following:

    - There is a per-CPU slot in memory for one message. This slot can in
    theory be accessed by any CPU.

    - We get an interrupt on the CPU when a message was placed into the slot.

    - When we read the message we need to clear the slot and signal the fact
    to the hypervisor. In case there are more messages to this CPU pending
    the hypervisor will deliver the next message. The signaling is done by
    writing to an MSR so this can only be done on the appropriate CPU.

    To avoid doing cross-CPU work on crash we have vmbus_wait_for_unload()
    function which checks message slots for all CPUs in a loop waiting for the
    'unload finished' messages. However, there is an issue which arises when
    these conditions are met:

    - We're crashing on a CPU which is different from the one which was used
    to initially contact the hypervisor.

    - The CPU which was used for the initial contact is blocked with interrupts
    disabled and there is a message pending in the message slot.

    In this case we won't be able to read the 'unload finished' message on the
    crashing CPU. This is reproducible when we receive unknown NMIs on all CPUs
    simultaneously: the first CPU entering panic() will proceed to crash and
    all other CPUs will stop themselves with interrupts disabled.

    The suggested solution is to handle unknown NMIs for Hyper-V guests on the
    first CPU which gets them only. This will allow us to rely on VMBus
    interrupt handler being able to receive the 'unload finish' message in
    case it is delivered to a different CPU.

    The issue is not reproducible on WS2016 as Debug-VM delivers NMI to the
    boot CPU only, WS2012R2 and earlier Hyper-V versions are affected.

    Signed-off-by: Vitaly Kuznetsov
    Acked-by: K. Y. Srinivasan
    Cc: devel@linuxdriverproject.org
    Cc: Haiyang Zhang
    Link: http://lkml.kernel.org/r/20161202100720.28121-1-vkuznets@redhat.com
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Ingo Molnar

    Vitaly Kuznetsov
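
The "first CPU only" rule from the message above can be sketched with a C11 atomic compare-and-exchange. Whether the actual patch uses this exact primitive is an assumption; `nmi_owner` and `claim_unknown_nmi()` are hypothetical names.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* -1 means that no CPU has claimed the unknown NMI yet. */
static atomic_int nmi_owner = -1;

/*
 * Return true only for the first caller. Every later unknown NMI,
 * on any CPU, is treated as already handled so that the other CPUs
 * keep running and can still take the VMBus interrupt.
 */
static bool claim_unknown_nmi(int cpu)
{
    int expected = -1;

    return atomic_compare_exchange_strong(&nmi_owner, &expected, cpu);
}
```

Only the CPU for which the helper returns true would go on to panic; the others return from the NMI with interrupts still usable.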
     
  • Pull KVM fixes from Paolo Bonzini:
    "Early fixes for x86.

    Instead of the (botched) revert, the lockdep/might_sleep splat has a
    real fix provided by Andrea"

    * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
    kvm: nVMX: Allow L1 to intercept software exceptions (#BP and #OF)
    kvm: take srcu lock around kvm_steal_time_set_preempted()
    kvm: fix schedule in atomic in kvm_steal_time_set_preempted()
    KVM: hyperv: fix locking of struct kvm_hv fields
    KVM: x86: Expose Intel AVX512IFMA/AVX512VBMI/SHA features to guest.
    kvm: nVMX: Correct a VMX instruction error code for VMPTRLD

    Linus Torvalds
     

19 Dec, 2016

17 commits

  • When L2 exits to L0 due to "exception or NMI", software exceptions
    (#BP and #OF) for which L1 has requested an intercept should be
    handled by L1 rather than L0. Previously, only hardware exceptions
    were forwarded to L1.

    Signed-off-by: Jim Mattson
    Cc: stable@vger.kernel.org
    Signed-off-by: Paolo Bonzini

    Jim Mattson
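
The routing decision can be sketched as a lookup in L1's exception bitmap, which has one bit per exception vector. This is a simplified illustration with a hypothetical function name; it ignores special cases such as page faults.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BP_VECTOR 3 /* #BP, breakpoint */
#define OF_VECTOR 4 /* #OF, overflow */

/* True if L1 asked to intercept this vector, for any exception type. */
static bool l1_intercepts_exception(uint32_t exception_bitmap,
                                    unsigned int vector)
{
    assert(vector < 32);
    return (exception_bitmap >> vector) & 1u;
}
```

The point of the fix is that this check applies to software exceptions such as #BP and #OF as well, not only to hardware exceptions.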
     
  • kvm_memslots() will be called by kvm_write_guest_offset_cached() so
    take the srcu lock.

    Signed-off-by: Andrea Arcangeli
    Signed-off-by: Paolo Bonzini

    Andrea Arcangeli
     
  • kvm_steal_time_set_preempted() isn't disabling the pagefaults before
    calling __copy_to_user and the kernel debug notices.

    Signed-off-by: Andrea Arcangeli
    Signed-off-by: Paolo Bonzini

    Andrea Arcangeli
     
  • Aside from being excessively slow, CPUID is problematic: Linux runs
    on a handful of CPUs that don't have CPUID. Use IRET-to-self
    instead. IRET-to-self works everywhere, so it makes testing easy.

    For reference, on my laptop, IRET-to-self is ~110ns,
    CPUID(eax=1, ecx=0) is ~83ns on native and very very slow under KVM,
    and MOV-to-CR2 is ~42ns.

    While we're at it: sync_core() serves a very specific purpose.
    Document it.

    Signed-off-by: Andy Lutomirski
    Cc: Juergen Gross
    Cc: One Thousand Gnomes
    Cc: Peter Zijlstra
    Cc: Brian Gerst
    Cc: Matthew Whitehead
    Cc: Borislav Petkov
    Cc: Henrique de Moraes Holschuh
    Cc: Andrew Cooper
    Cc: Boris Ostrovsky
    Cc: xen-devel
    Link: http://lkml.kernel.org/r/5c79f0225f68bc8c40335612bf624511abb78941.1481307769.git.luto@kernel.org
    Signed-off-by: Thomas Gleixner

    Andy Lutomirski
     
  • The Intel microcode driver is using sync_core() to mean "do CPUID
    with EAX=1". I want to rework sync_core(), but first the Intel
    microcode driver needs to stop depending on its current behavior.

    Reported-by: Henrique de Moraes Holschuh
    Signed-off-by: Andy Lutomirski
    Acked-by: Borislav Petkov
    Cc: Juergen Gross
    Cc: One Thousand Gnomes
    Cc: Peter Zijlstra
    Cc: Brian Gerst
    Cc: Matthew Whitehead
    Cc: Andrew Cooper
    Cc: Boris Ostrovsky
    Cc: xen-devel
    Link: http://lkml.kernel.org/r/535a025bb91fed1a019c5412b036337ad239e5bb.1481307769.git.luto@kernel.org
    Signed-off-by: Thomas Gleixner

    Andy Lutomirski
     
  • This reverts commit ed68d7e9b9cfb64f3045ffbcb108df03c09a0f98.

    The patch wasn't quite correct -- there are non-Intel (and hence
    non-486) CPUs that we support that don't have CPUID. Since we no
    longer require CPUID for sync_core(), just revert the patch.

    I think the relevant CPUs are Geode and Elan, but I'm not sure.

    In principle, we should try to do better at identifying CPUID-less
    CPUs in early boot, but that's more complicated.

    Reported-by: One Thousand Gnomes
    Signed-off-by: Andy Lutomirski
    Cc: Juergen Gross
    Cc: Denys Vlasenko
    Cc: Peter Zijlstra
    Cc: Brian Gerst
    Cc: Josh Poimboeuf
    Cc: Matthew Whitehead
    Cc: Borislav Petkov
    Cc: Henrique de Moraes Holschuh
    Cc: Andrew Cooper
    Cc: Boris Ostrovsky
    Cc: xen-devel
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/82acde18a108b8e353180dd6febcc2876df33f24.1481307769.git.luto@kernel.org
    Signed-off-by: Thomas Gleixner

    Andy Lutomirski
     
  • We support various non-Intel CPUs that don't have the CPUID
    instruction, so the M486 test was wrong. For now, fix it with a big
    hammer: handle missing CPUID on all 32-bit CPUs.

    Reported-by: One Thousand Gnomes
    Signed-off-by: Andy Lutomirski
    Cc: Juergen Gross
    Cc: Peter Zijlstra
    Cc: Brian Gerst
    Cc: Matthew Whitehead
    Cc: Borislav Petkov
    Cc: Henrique de Moraes Holschuh
    Cc: Andrew Cooper
    Cc: Boris Ostrovsky
    Cc: xen-devel
    Link: http://lkml.kernel.org/r/685bd083a7c036f7769510b6846315b17d6ba71f.1481307769.git.luto@kernel.org
    Signed-off-by: Thomas Gleixner

    Andy Lutomirski
     
  • A typo (or mis-merge?) resulted in leaf 6 only being probed if
    cpuid_level >= 7.

    Fixes: 2ccd71f1b278 ("x86/cpufeature: Move some of the scattered feature bits to x86_capability")
    Signed-off-by: Andy Lutomirski
    Acked-by: Borislav Petkov
    Cc: Brian Gerst
    Link: http://lkml.kernel.org/r/6ea30c0e9daec21e488b54761881a6dfcf3e04d0.1481825597.git.luto@kernel.org
    Signed-off-by: Thomas Gleixner

    Andy Lutomirski
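
The off-by-one is easy to see in a two-line sketch. Both function names are hypothetical; the first shows the intended check, the second mirrors the bug as the message describes it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A basic CPUID leaf may be probed when the maximum level reaches it. */
static bool should_probe_leaf(uint32_t cpuid_level, uint32_t leaf)
{
    /* Basic leaves only; hypervisor and extended ranges start higher. */
    assert(leaf < 0x40000000u);
    return cpuid_level >= leaf;
}

/* The bug as described: leaf 6 was gated on a level of at least 7. */
static bool buggy_probe_leaf6(uint32_t cpuid_level)
{
    return cpuid_level >= 7;
}
```

A CPU whose maximum basic leaf is exactly 6 passes the first check but is skipped by the second.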
     
  • gcc-7 warns:

    In file included from arch/x86/tools/relocs_64.c:17:0:
    arch/x86/tools/relocs.c: In function ‘process_64’:
    arch/x86/tools/relocs.c:953:2: warning: argument 1 null where non-null expected [-Wnonnull]
    qsort(r->offset, r->count, sizeof(r->offset[0]), cmp_relocs);
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    In file included from arch/x86/tools/relocs.h:6:0,
    from arch/x86/tools/relocs_64.c:1:
    /usr/include/stdlib.h:741:13: note: in a call to function ‘qsort’ declared here
    extern void qsort

    This happens because relocs16 is not used for ELF_BITS == 64,
    so there is no point in trying to sort it.

    Make the sort_relocs(&relocs16) call 32bit only.

    Signed-off-by: Markus Trippelsdorf
    Link: http://lkml.kernel.org/r/20161215124513.GA289@x4
    Signed-off-by: Thomas Gleixner

    Markus Trippelsdorf
     
  • The unwinder warnings are good at finding unexpected unwinder issues,
    but they often don't give enough data to be able to fully diagnose them.
    Print a one-time stack dump when a warning is detected.

    Signed-off-by: Josh Poimboeuf
    Cc: Borislav Petkov
    Cc: Andy Lutomirski
    Link: http://lkml.kernel.org/r/15607370e3ddb1732b6a73d5c65937864df16ac8.1481904011.git.jpoimboe@redhat.com
    Signed-off-by: Thomas Gleixner

    Josh Poimboeuf
     
  • Somehow, CONFIG_PARAVIRT=n convinces gcc to change the
    x86_64_start_kernel() prologue from:

    0000000000000129 <x86_64_start_kernel>:
    129: 55 push %rbp
    12a: 48 89 e5 mov %rsp,%rbp

    to:

    0000000000000124 <x86_64_start_kernel>:
    124: 4c 8d 54 24 08 lea 0x8(%rsp),%r10
    129: 48 83 e4 f0 and $0xfffffffffffffff0,%rsp
    12d: 41 ff 72 f8 pushq -0x8(%r10)
    131: 55 push %rbp
    132: 48 89 e5 mov %rsp,%rbp

    This is an unusual pattern which aligns rsp (though in this case it's
    already aligned) and saves the start_cpu() return address again on the
    stack before storing the frame pointer.

    The unwinder assumes the last stack frame header is at a certain offset,
    but the above code breaks that assumption, resulting in the following
    warning:

    WARNING: kernel stack frame pointer at ffffffff82e03f40 in swapper:0 has bad value (null)

    Fix it by checking for the last task stack frame at the aligned offset
    in addition to the normal unaligned offset.

    Fixes: acb4608ad186 ("x86/unwind: Create stack frames for saved syscall registers")
    Reported-by: Borislav Petkov
    Signed-off-by: Josh Poimboeuf
    Cc: Andy Lutomirski
    Link: http://lkml.kernel.org/r/9d7b4eb8cf55a7d6002cb738f25c23e7429c99a0.1481904011.git.jpoimboe@redhat.com
    Signed-off-by: Thomas Gleixner

    Josh Poimboeuf
     
  • Signed-off-by: Dmitry Torokhov
    Acked-by: Marcos Paulo de Souza
    Cc: linux-input@vger.kernel.org
    Link: http://lkml.kernel.org/r/1481317061-31486-5-git-send-email-dmitry.torokhov@gmail.com
    Signed-off-by: Thomas Gleixner

    Dmitry Torokhov
     
  • Now that i8042 uses a flag in legacy platform data, i8042_detect() is
    no longer used and can be removed.

    Signed-off-by: Dmitry Torokhov
    Tested-by: Takashi Iwai
    Acked-by: Marcos Paulo de Souza
    Cc: linux-input@vger.kernel.org
    Link: http://lkml.kernel.org/r/1481317061-31486-4-git-send-email-dmitry.torokhov@gmail.com
    Signed-off-by: Thomas Gleixner

    Dmitry Torokhov
     
  • Add i8042 state to the platform data to help the i8042 driver decide
    whether to probe for i8042 or not. We recognize 3 states: platform/subarch
    cannot possibly have i8042 (as is the case with the Intel MID platform),
    firmware (such as ACPI) reports that i8042 is absent from the device,
    or i8042 may be present and the driver should probe for it.

    The intent is to allow the i8042 driver to abort initialization on x86 if PNP data
    (absence of both keyboard and mouse PNP devices) agrees with firmware data.

    It will also allow us to remove i8042_detect later.

    Signed-off-by: Dmitry Torokhov
    Tested-by: Takashi Iwai
    Acked-by: Marcos Paulo de Souza
    Cc: linux-input@vger.kernel.org
    Link: http://lkml.kernel.org/r/1481317061-31486-2-git-send-email-dmitry.torokhov@gmail.com
    Signed-off-by: Thomas Gleixner

    Dmitry Torokhov
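
The three states and the intended probe decision can be sketched as follows. The enum and function names are hypothetical, and the rule for the firmware-absent case is read from the message above: abort only when PNP data agrees with the firmware.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for the three states described in the message. */
enum i8042_state_sketch {
    I8042_PLATFORM_ABSENT,  /* platform/subarch cannot have an i8042 */
    I8042_FIRMWARE_ABSENT,  /* firmware (such as ACPI) reports it absent */
    I8042_EXPECTED_PRESENT, /* may be present, the driver should probe */
};

static bool should_probe_i8042(enum i8042_state_sketch state,
                               bool pnp_found_kbd_or_mouse)
{
    switch (state) {
    case I8042_PLATFORM_ABSENT:
        return false;
    case I8042_FIRMWARE_ABSENT:
        /* Abort only if PNP data agrees that nothing is there. */
        return pnp_found_kbd_or_mouse;
    case I8042_EXPECTED_PRESENT:
        return true;
    }
    assert(!"unknown i8042 state");
    return false;
}
```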
     
  • When CONFIG_PARAVIRT is selected, cpuid() becomes a call. Since
    for 32-bit kernels load_ucode_amd_bsp() is executed before paging
    is enabled the call cannot be completed (as kernel virtual addresses
    are not reachable yet).

    Use native_cpuid() instead which is an asm wrapper for the CPUID
    instruction.

    Signed-off-by: Boris Ostrovsky
    Signed-off-by: Borislav Petkov
    Cc: Jürgen Gross
    Link: http://lkml.kernel.org/r/1481906392-3847-1-git-send-email-boris.ostrovsky@oracle.com
    Link: http://lkml.kernel.org/r/20161218164414.9649-5-bp@alien8.de
    Signed-off-by: Thomas Gleixner

    Boris Ostrovsky
     
  • Doing so is completely void of sense for multiple reasons so prevent
    it. Set dis_ucode_ldr to true and thus disable the microcode loader by
    default to address xen pv guests which execute the AP path but not the
    BSP path.

    By having it turned off by default, the APs won't run into the loader
    either.

    Also, check CPUID(1).ECX[31] which hypervisors set. Well almost, not the
    xen pv one. That one gets the aforementioned "fix".

    Also, improve the detection method by caching the final decision whether
    to continue loading in dis_ucode_ldr and do it once on the BSP. The APs
    then simply test that value.

    Signed-off-by: Borislav Petkov
    Tested-by: Juergen Gross
    Tested-by: Boris Ostrovsky
    Acked-by: Juergen Gross
    Link: http://lkml.kernel.org/r/20161218164414.9649-4-bp@alien8.de
    Signed-off-by: Thomas Gleixner

    Borislav Petkov
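
The "decide once on the BSP, test on the APs" scheme can be sketched like this. CPUID(1).ECX bit 31 is the hypervisor-present bit; everything else here, including the names and the command-line parameter, is a hypothetical simplification of the loader logic.

```c
#include <stdbool.h>
#include <stdint.h>

/* CPUID(1).ECX[31]: set by hypervisors to announce themselves. */
#define CPUID_1_ECX_HYPERVISOR (1u << 31)

/* Cached decision; disabled by default so APs never load by accident. */
static bool dis_ucode_ldr_sketch = true;

/* Runs once on the boot CPU and records the final decision. */
static void decide_on_bsp(uint32_t cpuid_1_ecx, bool disabled_on_cmdline)
{
    dis_ucode_ldr_sketch = disabled_on_cmdline ||
                           (cpuid_1_ecx & CPUID_1_ECX_HYPERVISOR) != 0;
}

/* Application processors only test the cached value. */
static bool ap_may_load(void)
{
    return !dis_ucode_ldr_sketch;
}
```

A guest that runs the AP path without ever running the BSP path keeps the default and therefore never enters the loader.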
     
  • Make it simply return bool to denote whether it found a container or not
    and return the pointer to the container and its size in the handed-in
    container pointer instead, as returning a struct was just silly.

    Signed-off-by: Borislav Petkov
    Cc: Jürgen Gross
    Cc: Boris Ostrovsky
    Link: http://lkml.kernel.org/r/20161218164414.9649-3-bp@alien8.de
    Signed-off-by: Thomas Gleixner

    Borislav Petkov