Eric Lee / smarc-fsl-linux-kernel

27 Dec, 2016

2 commits

50fb57042 crypto: aesni-intel - RFC4106 can zero copy when !PageHighMem

In the common case of !PageHighMem we can do zero copy crypto
even if sg crosses a page boundary.

Signed-off-by: Ilya Lesokhin
Signed-off-by: Herbert Xu

Ilya Lesokhin
2016-12-27 17:48:48 +0800
9ae433bc7 crypto: chacha20 - convert generic and x86 versions to skcipher

This converts the ChaCha20 code from a blkcipher to a skcipher, which
is now the preferred way to implement symmetric block and stream ciphers.

This ports the generic and x86 versions at the same time because the
latter reuses routines of the former.

Note that the skcipher_walk() API guarantees that all presented blocks
except the final one are a multiple of the chunk size, so we can simplify
the encrypt() routine somewhat.

Signed-off-by: Ard Biesheuvel
Signed-off-by: Herbert Xu

Ard Biesheuvel
2016-12-27 17:47:31 +0800

26 Dec, 2016

3 commits

3ddc76dfc Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull timer type cleanups from Thomas Gleixner:
"This series does a tree wide cleanup of types related to
timers/timekeeping.

- Get rid of cycles_t and use a plain u64. The type is not really
helpful and caused more confusion than clarity

- Get rid of the ktime union. The union has become useless as we use
the scalar nanoseconds storage unconditionally now. The 32bit
timespec alike storage got removed due to the Y2038 limitations
some time ago.

That leaves the odd union access around for no reason. Clean it up.

Both changes have been done with coccinelle and a small amount of
manual mopping up"

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
ktime: Get rid of ktime_equal()
ktime: Cleanup ktime_set() usage
ktime: Get rid of the union
clocksource: Use a plain u64 instead of cycle_t

Linus Torvalds
2016-12-26 06:30:04 +0800
b272f732f Merge branch 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull SMP hotplug notifier removal from Thomas Gleixner:
"This is the final cleanup of the hotplug notifier infrastructure. The
series has been reintegrated in the last two days because there came a
new driver using the old infrastructure via the SCSI tree.

Summary:

- convert the last leftover drivers utilizing notifiers

- fixup for a completely broken hotplug user

- prevent setup of already used states

- removal of the notifiers

- treewide cleanup of hotplug state names

- consolidation of state space

There is a sphinx based documentation pending, but that needs review
from the documentation folks"

* 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/armada-xp: Consolidate hotplug state space
irqchip/gic: Consolidate hotplug state space
coresight/etm3/4x: Consolidate hotplug state space
cpu/hotplug: Cleanup state names
cpu/hotplug: Remove obsolete cpu hotplug register/unregister functions
staging/lustre/libcfs: Convert to hotplug state machine
scsi/bnx2i: Convert to hotplug state machine
scsi/bnx2fc: Convert to hotplug state machine
cpu/hotplug: Prevent overwriting of callbacks
x86/msr: Remove bogus cleanup from the error path
bus: arm-ccn: Prevent hotplug callback leak
perf/x86/intel/cstate: Prevent hotplug callback leak
ARM/imx/mmcd: Fix broken cpu hotplug handling
scsi: qedi: Convert to hotplug state machine

Linus Torvalds
2016-12-26 06:05:56 +0800
8b0e19531 ktime: Cleanup ktime_set() usage

ktime_set(S,N) was required for the timespec storage type and is still
useful for situations where a Seconds and Nanoseconds part of a time value
needs to be converted. For anything where the Seconds argument is 0, this
is pointless and can be replaced with a simple assignment.
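
As an illustration only, the equivalence can be modelled in userspace. The names
model_ktime_t, model_ktime_set() and MODEL_NSEC_PER_SEC below are hypothetical
stand-ins for this sketch, not the kernel definitions; the sketch assumes the
scalar nanosecond storage described above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for a scalar nanosecond time value. */
typedef int64_t model_ktime_t;
#define MODEL_NSEC_PER_SEC 1000000000LL

/* Combine a seconds part and a nanoseconds part into scalar nanoseconds. */
static model_ktime_t model_ktime_set(int64_t secs, unsigned long nsecs)
{
	assert((int64_t)nsecs < MODEL_NSEC_PER_SEC);
	return secs * MODEL_NSEC_PER_SEC + (int64_t)nsecs;
}

/* With secs == 0 the conversion degenerates to a plain assignment. */
static model_ktime_t model_from_ns(unsigned long nsecs)
{
	return (model_ktime_t)nsecs;
}
```

In this model, model_ktime_set(0, n) and model_from_ns(n) yield the same value
for any n below one second, which is why the call adds nothing in that case.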

Signed-off-by: Thomas Gleixner
Cc: Peter Zijlstra

Thomas Gleixner
2016-12-26 00:21:22 +0800

25 Dec, 2016

5 commits

a5a1d1c29 clocksource: Use a plain u64 instead of cycle_t

There is no point in having an extra type for extra confusion. u64 is
unambiguous.

Conversion was done with the following coccinelle script:

@rem@
@@
-typedef u64 cycle_t;

@fix@
typedef cycle_t;
@@
-cycle_t
+u64

Signed-off-by: Thomas Gleixner
Cc: Peter Zijlstra
Cc: John Stultz

Thomas Gleixner
2016-12-25 18:04:12 +0800
73c1b41e6 cpu/hotplug: Cleanup state names

When the state names got added a script was used to add the extra argument
to the calls. The script basically converted the state constant to a
string, but the cleanup to convert these strings into meaningful ones did
not happen.

Replace all the useless strings with 'subsys/xxx/yyy:state' strings which
are used in all the other places already.

Signed-off-by: Thomas Gleixner
Cc: Peter Zijlstra
Cc: Sebastian Siewior
Link: http://lkml.kernel.org/r/20161221192112.085444152@linutronix.de
Signed-off-by: Thomas Gleixner

Thomas Gleixner
2016-12-25 17:47:44 +0800
59fefd089 x86/msr: Remove bogus cleanup from the error path

The error cleanup which is invoked when the hotplug state setup failed
tries to remove the failed state, which is broken.

Fixes: 8fba38c937cd ("x86/msr: Convert to hotplug state machine")
Reported-by: kernel test robot
Signed-off-by: Thomas Gleixner
Cc: Sebastian Siewior

Thomas Gleixner
2016-12-25 17:47:41 +0800
834fcd298 perf/x86/intel/cstate: Prevent hotplug callback leak

If the pmu registration fails the registered hotplug callbacks are not
removed. Wrong in any case, but fatal in case of a modular driver.

Replace the nonsensical state names with proper ones while at it.
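
The general shape of such an error path can be sketched in userspace as follows.
This is a simplified model under stated assumptions, not the driver code: all
model_* functions and the callback counter are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model state: number of hotplug callbacks currently installed. */
static int model_installed_callbacks;

static int model_setup_state(void)
{
	model_installed_callbacks++;
	return 0;
}

static void model_remove_state(void)
{
	assert(model_installed_callbacks > 0);
	model_installed_callbacks--;
}

/* Hypothetical PMU registration that can be forced to fail. */
static int model_register_pmu(bool fail)
{
	return fail ? -1 : 0;
}

static int model_driver_init(bool pmu_fails)
{
	int err;

	err = model_setup_state();
	if (err)
		return err;

	err = model_register_pmu(pmu_fails);
	if (err) {
		/* Undo the callback installation so nothing is left dangling. */
		model_remove_state();
		return err;
	}
	return 0;
}
```

Without the model_remove_state() call in the failure branch, a module that
fails to load would leave callbacks pointing into freed code, which is the
"fatal" case mentioned above.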

Fixes: 77c34ef1c319 ("perf/x86/intel/cstate: Convert Intel CSTATE to hotplug state machine")
Signed-off-by: Thomas Gleixner
Cc: Sebastian Siewior
Cc: Peter Zijlstra
Cc: stable@vger.kernel.org

Thomas Gleixner
2016-12-25 17:47:40 +0800
7c0f6ba68 Replace <asm/uaccess.h> with <linux/uaccess.h> globally

This was entirely automated, using the script by Al:

PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>'
sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \
$(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h)

to do the replacement at the end of the merge window.

Requested-by: Al Viro
Signed-off-by: Linus Torvalds

Linus Torvalds
2016-12-25 03:46:01 +0800

24 Dec, 2016

3 commits

6ac3bb167 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Ingo Molnar:
"There's a number of fixes:

- a round of fixes for CPUID-less legacy CPUs
- a number of microcode loader fixes
- i8042 detection robustization fixes
- stack dump/unwinder fixes
- x86 SoC platform driver fixes
- a GCC 7 warning fix
- virtualization related fixes"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
Revert "x86/unwind: Detect bad stack return address"
x86/paravirt: Mark unused patch_default label
x86/microcode/AMD: Reload proper initrd start address
x86/platform/intel/quark: Add printf attribute to imr_self_test_result()
x86/platform/intel-mid: Switch MPU3050 driver to IIO
x86/alternatives: Do not use sync_core() to serialize I$
x86/topology: Document cpu_llc_id
x86/hyperv: Handle unknown NMIs on one CPU when unknown_nmi_panic
x86/asm: Rewrite sync_core() to use IRET-to-self
x86/microcode/intel: Replace sync_core() with native_cpuid()
Revert "x86/boot: Fail the boot if !M486 and CPUID is missing"
x86/asm/32: Make sync_core() handle missing CPUID on all 32-bit kernels
x86/cpu: Probe CPUID leaf 6 even when cpuid_level == 6
x86/tools: Fix gcc-7 warning in relocs.c
x86/unwind: Dump stack data on warnings
x86/unwind: Adjust last frame check for aligned function stacks
x86/init: Fix a couple of comment typos
x86/init: Remove i8042_detect() from platform ops
Input: i8042 - Trust firmware a bit more when probing on X86
x86/init: Add i8042 state to the platform data
...

Linus Torvalds
2016-12-24 08:54:46 +0800
00198dab3 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf fixes from Ingo Molnar:
"On the kernel side there's two x86 PMU driver fixes and a uprobes fix,
plus on the tooling side there's a number of fixes and some late
updates"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits)
perf sched timehist: Fix invalid period calculation
perf sched timehist: Remove hardcoded 'comm_width' check at print_summary
perf sched timehist: Enlarge default 'comm_width'
perf sched timehist: Honour 'comm_width' when aligning the headers
perf/x86: Fix overlap counter scheduling bug
perf/x86/pebs: Fix handling of PEBS buffer overflows
samples/bpf: Move open_raw_sock to separate header
samples/bpf: Remove perf_event_open() declaration
samples/bpf: Be consistent with bpf_load_program bpf_insn parameter
tools lib bpf: Add bpf_prog_{attach,detach}
samples/bpf: Switch over to libbpf
perf diff: Do not overwrite valid build id
perf annotate: Don't throw error for zero length symbols
perf bench futex: Fix lock-pi help string
perf trace: Check if MAP_32BIT is defined (again)
samples/bpf: Make perf_event_read() static
uprobes: Fix uprobes on MIPS, allow for a cache flush after ixol breakpoint creation
samples/bpf: Make samples more libbpf-centric
tools lib bpf: Add flags to bpf_create_map()
tools lib bpf: use __u32 from linux/types.h
...

Linus Torvalds
2016-12-24 08:49:12 +0800
c280f7736 Revert "x86/unwind: Detect bad stack return address"

Revert the following commit:

b6959a362177 ("x86/unwind: Detect bad stack return address")

... because Andrey Konovalov reported an unwinder warning:

WARNING: unrecognized kernel stack return address ffffffffa0000001 at ffff88006377fa18 in a.out:4467

The unwind was initiated from an interrupt which occurred while running in the
generated code for a kprobe. The unwinder printed the warning because it
expected regs->ip to point to a valid text address, but instead it pointed to
the generated code.

Eventually we may want to come up with a way to identify generated kprobe
code so the unwinder can know that it's a valid return address. Until
then, just remove the warning.

Reported-by: Andrey Konovalov
Signed-off-by: Josh Poimboeuf
Cc: Andy Lutomirski
Cc: Borislav Petkov
Cc: Brian Gerst
Cc: Denys Vlasenko
Cc: H. Peter Anvin
Cc: Linus Torvalds
Cc: Masami Hiramatsu
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Link: http://lkml.kernel.org/r/02f296848fbf49fb72dfeea706413ecbd9d4caf6.1482418739.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar

Josh Poimboeuf
2016-12-24 03:32:30 +0800

23 Dec, 2016

4 commits

eb254f323 Merge branch 'x86-cache-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 cache allocation interface from Thomas Gleixner:
"This provides support for Intel's Cache Allocation Technology, a cache
partitioning mechanism.

The interface is odd, but the hardware interface of that CAT stuff is
odd as well.

We tried hard to come up with an abstraction, but that only allows
rather simple partitioning, but no way of sharing and dealing with the
per package nature of this mechanism.

In the end we decided to expose the allocation bitmaps directly so all
combinations of the hardware can be utilized.

There are two ways of associating a cache partition:

- Task

A task can be added to a resource group. It uses the cache
partition associated to the group.

- CPU

All tasks which are not members of a resource group use the group
with which the CPU they are running on is associated.

That allows for simple CPU based partitioning schemes.
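
The two association rules can be sketched as a small selection function. This
is a minimal userspace model: it assumes that group 0 stands for "no explicit
resource group", and model_effective_group() is a hypothetical name.

```c
#include <assert.h>

/* Assumption for this sketch: group 0 means "task is in no explicit group". */
#define MODEL_DEFAULT_GROUP 0

/* Pick the resource group whose cache partition applies to a running task. */
static int model_effective_group(int task_group, int cpu_group)
{
	assert(task_group >= 0 && cpu_group >= 0);
	if (task_group != MODEL_DEFAULT_GROUP)
		return task_group;	/* task association wins */
	return cpu_group;		/* otherwise fall back to the CPU's group */
}
```

A task explicitly placed in group 3 keeps group 3 on any CPU, while an
unassigned task inherits whatever group its current CPU belongs to.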

The main expected users are:

- Virtualization so a VM can only trash the associated part of
the cache w/o disturbing others

- Real-Time systems to separate RT and general workloads.

- Latency sensitive enterprise workloads

- In theory this also can be used to protect against cache side
channel attacks"

[ Intel RDT is "Resource Director Technology". The interface really is
rather odd and very specific, which delayed this pull request while I
was thinking about it. The pull request itself came in early during
the merge window, I just delayed it until things had calmed down and I
had more time.

But people tell me they'll use this, and the good news is that it is
_so_ specific that it's rather independent of anything else, and no
user is going to depend on the interface since it's pretty rare. So if
push comes to shove, we can just remove the interface and nothing will
break ]

* 'x86-cache-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (31 commits)
x86/intel_rdt: Implement show_options() for resctrlfs
x86/intel_rdt: Call intel_rdt_sched_in() with preemption disabled
x86/intel_rdt: Update task closid immediately on CPU in rmdir and unmount
x86/intel_rdt: Fix setting of closid when adding CPUs to a group
x86/intel_rdt: Update percpu closid immeditately on CPUs affected by changee
x86/intel_rdt: Reset per cpu closids on unmount
x86/intel_rdt: Select KERNFS when enabling INTEL_RDT_A
x86/intel_rdt: Prevent deadlock against hotplug lock
x86/intel_rdt: Protect info directory from removal
x86/intel_rdt: Add info files to Documentation
x86/intel_rdt: Export the minimum number of set mask bits in sysfs
x86/intel_rdt: Propagate error in rdt_mount() properly
x86/intel_rdt: Add a missing #include
MAINTAINERS: Add maintainer for Intel RDT resource allocation
x86/intel_rdt: Add scheduler hook
x86/intel_rdt: Add schemata file
x86/intel_rdt: Add tasks files
x86/intel_rdt: Add cpus file
x86/intel_rdt: Add mkdir to resctrl file system
x86/intel_rdt: Add "info" files to resctrl file system
...

Linus Torvalds
2016-12-23 01:25:45 +0800
1134c2b5c perf/x86: Fix overlap counter scheduling bug

Jiri reported the overlap scheduling exceeding its max stack.

Looking at the constraint that triggered this, it turns out the
overlap marker isn't needed.

The comment with EVENT_CONSTRAINT_OVERLAP states: "This is the case if
the counter mask of such an event is not a subset of any other counter
mask of a constraint with an equal or higher weight".

Esp. that latter part is of interest here I think, our overlapping mask
is 0x0e, that has 3 bits set and is the highest weight mask on the
PMU, therefore it will be placed last. Can we still create a scenario
where we would need to rewind that?

The scenario for AMD Fam15h is we're having masks like:

0x3F -- 111111
0x38 -- 111000
0x07 -- 000111

0x09 -- 001001

And we mark 0x09 as overlapping, because it is not a direct subset of
0x38 or 0x07 and has less weight than either of those. This means we'll
first try and place the 0x09 event, then try and place 0x38/0x07 events.
Now imagine we have:

3 * 0x07 + 0x09

and the initial pick for the 0x09 event is counter 0, then we'll fail to
place all 0x07 events. So we'll pop back, try counter 4 for the 0x09
event, and then re-try all 0x07 events, which will now work.

The masks on the PMU in question are:

0x01 - 0001
0x03 - 0011
0x0e - 1110
0x0c - 1100

But since all the masks that have overlap (0xe -> {0xc,0x3}) and (0x3 ->
0x1) are of heavier weight, it should all work out.
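
The Fam15h scenario above (3 * 0x07 + 0x09) can be reproduced with a small
userspace model. This is a sketch only: model_greedy() and model_backtrack()
are hypothetical helpers, not the scheduler in the perf code, and counters are
numbered from bit 0 of the mask.

```c
#include <assert.h>

/* The example from the text: one 0x09 event placed first, then three 0x07. */
static const unsigned model_fam15h_events[] = { 0x09, 0x07, 0x07, 0x07 };

/* Greedy placement: each event takes the lowest free counter its mask allows. */
static int model_greedy(const unsigned *masks, int n)
{
	unsigned used = 0;

	for (int i = 0; i < n; i++) {
		unsigned avail = masks[i] & ~used;

		if (!avail)
			return 0;		/* no counter left for this event */
		used |= avail & (~avail + 1u);	/* claim the lowest free bit */
	}
	return 1;
}

/* Backtracking placement: rewind earlier picks when a later event fails. */
static int model_backtrack(const unsigned *masks, int n, int idx, unsigned used)
{
	assert(idx >= 0 && idx <= n);
	if (idx == n)
		return 1;
	for (int c = 0; c < 32; c++) {
		unsigned bit = 1u << c;

		if ((masks[idx] & bit) && !(used & bit) &&
		    model_backtrack(masks, n, idx + 1, used | bit))
			return 1;
	}
	return 0;
}
```

With the 0x09 event placed first, the greedy variant takes counter 0 and then
runs out of counters for the third 0x07 event, while the backtracking variant
moves the 0x09 event to its other counter and succeeds. That rewind is what the
overlap marker exists for.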

Reported-by: Jiri Olsa
Tested-by: Jiri Olsa
Signed-off-by: Peter Zijlstra (Intel)
Cc: Alexander Shishkin
Cc: Arnaldo Carvalho de Melo
Cc: Jiri Olsa
Cc: Liang Kan
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Robert Richter
Cc: Stephane Eranian
Cc: Thomas Gleixner
Cc: Vince Weaver
Link: http://lkml.kernel.org/r/20161109155153.GQ3142@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar

Peter Zijlstra
2016-12-23 00:45:43 +0800
daa864b8f perf/x86/pebs: Fix handling of PEBS buffer overflows

This patch solves a race condition between PEBS and the PMU handler.

In case multiple PEBS events are sampled at the same time,
it is possible to have GLOBAL_STATUS bit 62 set indicating
PEBS buffer overflow and also seeing at most 3 PEBS counters
having their bits set in the status register. This is a sign
that there was at least one PEBS record pending at the time
of the PMU interrupt. PEBS counters must only be processed
via the drain_pebs() calls, and not via the regular sample
processing loop coming after that function, otherwise
phony regular samples may be generated in the sampling buffer
not marked with the EXACT tag.

Another possibility is to have one PEBS event and at least
one non-PEBS event which overflows while PEBS is armed. In this
case, bit 62 of GLOBAL_STATUS will not be set, yet the overflow
status bit for the PEBS counter will be on Skylake.

To avoid this problem, we systematically ignore the PEBS-enabled
counters from the GLOBAL_STATUS mask and we always process PEBS
events via drain_pebs().
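
The masking step can be modelled in a few lines. This is a userspace sketch of
the idea only: model_regular_status() and the pebs_enabled bitmask are
hypothetical, and the sketch leaves out everything else the interrupt handler
does.

```c
#include <assert.h>
#include <stdint.h>

/* Bit 62 of GLOBAL_STATUS signals a PEBS buffer overflow (see text above). */
#define MODEL_PEBS_OVF_BIT (1ULL << 62)

/*
 * Return the status bits the regular sample loop may process.
 * pebs_enabled is a hypothetical bitmask, one bit per PEBS-enabled counter.
 */
static uint64_t model_regular_status(uint64_t global_status,
				     uint64_t pebs_enabled)
{
	assert(!(pebs_enabled & MODEL_PEBS_OVF_BIT));
	/* PEBS counters belong to the drain path, so drop them here. */
	return global_status & ~MODEL_PEBS_OVF_BIT & ~pebs_enabled;
}
```

Both scenarios from the text are covered: whether or not bit 62 is set, an
overflow bit belonging to a PEBS-enabled counter never reaches the regular
sample loop.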

The problem manifested itself by having non-exact samples when
sampling only PEBS events, i.e., the PERF_SAMPLE_RECORD would
not have the EXACT flag set.

Note that this problem is only present on Skylake processors.
This fix is harmless on older processors.

Reported-by: Peter Zijlstra
Signed-off-by: Stephane Eranian
Signed-off-by: Peter Zijlstra (Intel)
Cc: Alexander Shishkin
Cc: Arnaldo Carvalho de Melo
Cc: Jiri Olsa
Cc: Linus Torvalds
Cc: Thomas Gleixner
Cc: Vince Weaver
Link: http://lkml.kernel.org/r/1482395366-8992-1-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar

Stephane Eranian
2016-12-23 00:45:36 +0800
cef4402d7 x86/paravirt: Mark unused patch_default label

A bugfix commit:

45dbea5f55c0 ("x86/paravirt: Fix native_patch()")

... introduced a harmless warning:

arch/x86/kernel/paravirt_patch_32.c: In function 'native_patch':
arch/x86/kernel/paravirt_patch_32.c:71:1: error: label 'patch_default' defined but not used [-Werror=unused-label]

Fix it by annotating the label as __maybe_unused.

Reported-by: Arnd Bergmann
Reported-by: Piotr Gregor
Signed-off-by: Peter Zijlstra (Intel)
Cc: Andy Lutomirski
Cc: Borislav Petkov
Cc: Brian Gerst
Cc: Denys Vlasenko
Cc: H. Peter Anvin
Cc: Josh Poimboeuf
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Fixes: 45dbea5f55c0 ("x86/paravirt: Fix native_patch()")
Signed-off-by: Ingo Molnar

Peter Zijlstra
2016-12-23 00:43:35 +0800

21 Dec, 2016

1 commit

8877ebdd3 x86/microcode/AMD: Reload proper initrd start address

When we switch to virtual addresses and, especially after
reserve_initrd()->relocate_initrd() have run, we have the updated initrd
address in initrd_start. Use initrd_start then instead of the address
which has been passed to us through boot params. (That still gets used
when we're running the very early routines on the BSP).

Reported-and-tested-by: Boris Ostrovsky
Signed-off-by: Borislav Petkov
Link: http://lkml.kernel.org/r/20161220144012.lc4cwrg6dphqbyqu@pd.tnic
Signed-off-by: Thomas Gleixner

Borislav Petkov
2016-12-21 17:50:04 +0800

20 Dec, 2016

5 commits

9120cf4fd x86/platform/intel/quark: Add printf attribute to imr_self_test_result()

__printf() attributes help detect issues in printf() format strings at
compile time.

Even though imr_selftest.c is only compiled with
CONFIG_DEBUG_IMR_SELFTEST=y, GCC complains about a missing format
attribute when compiling allmodconfig with -Wmissing-format-attribute.

Silence this warning by adding the attribute.
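
For readers unfamiliar with the attribute, here is a minimal userspace sketch
of the compiler feature behind it. The helper model_test_result() is
hypothetical and only loosely resembles the selftest function; the attribute
syntax is the GCC/Clang one.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/*
 * Hypothetical helper: writes "pass: ..." or "fail: ..." into buf.
 * The format attribute declares argument 4 to be a printf-style format
 * string whose variadic arguments start at position 5, so the compiler
 * can check callers at build time.
 */
static __attribute__((format(printf, 4, 5)))
int model_test_result(char *buf, size_t len, int ok, const char *fmt, ...)
{
	va_list ap;
	int used, n;

	assert(buf && len > 0);
	used = snprintf(buf, len, "%s: ", ok ? "pass" : "fail");
	if (used < 0 || (size_t)used >= len)
		return -1;

	va_start(ap, fmt);
	n = vsnprintf(buf + used, len - used, fmt, ap);
	va_end(ap);
	return n < 0 ? -1 : used + n;
}
```

With the attribute in place, a call such as
model_test_result(buf, sizeof(buf), 1, "%s regions", 3) is expected to draw a
-Wformat warning, because an int is passed where %s wants a string.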

Signed-off-by: Nicolas Iooss
Acked-by: Bryan O'Donoghue
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Link: http://lkml.kernel.org/r/20161219132144.4108-1-nicolas.iooss_linux@m4x.org
Signed-off-by: Ingo Molnar

Nicolas Iooss
2016-12-20 16:37:24 +0800
634b847b6 x86/platform/intel-mid: Switch MPU3050 driver to IIO

The Intel Mid goes in and creates an I2C device for the
MPU3050 if the input driver for MPU-3050 is activated.

As of commit:

3904b28efb2c ("iio: gyro: Add driver for the MPU-3050 gyroscope")

... there is a proper and fully featured IIO driver for this
device, so deprecate the use of the incomplete input driver
by augmenting the device population code to react to the
presence of the IIO driver's Kconfig symbol instead.

Signed-off-by: Linus Walleij
Acked-by: Andy Shevchenko
Cc: Dmitry Torokhov
Cc: Jonathan Cameron
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Link: http://lkml.kernel.org/r/1481722794-4348-1-git-send-email-linus.walleij@linaro.org
Signed-off-by: Ingo Molnar

Linus Walleij
2016-12-20 16:37:15 +0800
34bfab0ea x86/alternatives: Do not use sync_core() to serialize I$

We use sync_core() in the alternatives code to stop speculative
execution of prefetched instructions because we are potentially changing
them and don't want to execute stale bytes.

What it does on most machines is call CPUID which is a serializing
instruction. And that's expensive.

However, the instruction cache is serialized when we're on the local CPU
and are changing the data through the same virtual address. So then, we
don't need the serializing CPUID but a simple control flow change. Last
being accomplished with a CALL/RET which the noinline causes.

Suggested-by: Linus Torvalds
Signed-off-by: Borislav Petkov
Reviewed-by: Andy Lutomirski
Cc: Andrew Cooper
Cc: Andy Lutomirski
Cc: Brian Gerst
Cc: Henrique de Moraes Holschuh
Cc: Matthew Whitehead
Cc: One Thousand Gnomes
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Link: http://lkml.kernel.org/r/20161203150258.vwr5zzco7ctgc4pe@pd.tnic
Signed-off-by: Ingo Molnar

Borislav Petkov
2016-12-20 16:36:42 +0800
59107e2f4 x86/hyperv: Handle unknown NMIs on one CPU when unknown_nmi_panic

There is a feature in Hyper-V ('Debug-VM --InjectNonMaskableInterrupt')
which injects NMI to the guest. We may want to crash the guest and do kdump
on this NMI by enabling unknown_nmi_panic. To make kdump succeed we need to
allow the kdump kernel to re-establish VMBus connection so it will see
VMBus devices (storage, network,..).

To properly unload VMBus making it possible to start over during kdump we
need to do the following:

- Send an 'unload' message to the hypervisor. This can be done on any CPU
so we do this on the crashing CPU.

- Receive the 'unload finished' reply message. WS2012R2 delivers this
message to the CPU which was used to establish VMBus connection during
module load and this CPU may differ from the CPU sending 'unload'.

Receiving a VMBus message means the following:

- There is a per-CPU slot in memory for one message. This slot can in
theory be accessed by any CPU.

- We get an interrupt on the CPU when a message was placed into the slot.

- When we read the message we need to clear the slot and signal the fact
to the hypervisor. In case there are more messages to this CPU pending
the hypervisor will deliver the next message. The signaling is done by
writing to an MSR so this can only be done on the appropriate CPU.

To avoid doing cross-CPU work on crash we have vmbus_wait_for_unload()
function which checks message slots for all CPUs in a loop waiting for the
'unload finished' messages. However, there is an issue which arises when
these conditions are met:

- We're crashing on a CPU which is different from the one which was used
to initially contact the hypervisor.

- The CPU which was used for the initial contact is blocked with interrupts
disabled and there is a message pending in the message slot.

In this case we won't be able to read the 'unload finished' message on the
crashing CPU. This is reproducible when we receive unknown NMIs on all CPUs
simultaneously: the first CPU entering panic() will proceed to crash and
all other CPUs will stop themselves with interrupts disabled.

The suggested solution is to handle unknown NMIs for Hyper-V guests on the
first CPU which gets them only. This will allow us to rely on VMBus
interrupt handler being able to receive the 'unload finish' message in
case it is delivered to a different CPU.
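
The "first CPU only" rule boils down to an atomic claim. The sketch below is a
userspace model using C11 atomics; model_nmi_cpu and
model_claim_unknown_nmi() are hypothetical names, and the real handler has
additional conditions that are not shown.

```c
#include <assert.h>
#include <stdatomic.h>

/* -1 means no CPU has claimed the unknown NMI yet. */
static atomic_int model_nmi_cpu = -1;

/*
 * Returns 1 if the calling CPU is the first to see the unknown NMI and
 * should go on to handle it; returns 0 for every later caller, which
 * simply swallows the NMI and keeps running with interrupts enabled.
 */
static int model_claim_unknown_nmi(int cpu)
{
	int expected = -1;

	assert(cpu >= 0);
	return atomic_compare_exchange_strong(&model_nmi_cpu, &expected, cpu);
}
```

Because the compare-and-exchange succeeds exactly once, only one CPU proceeds
to panic; the others stay able to take the VMBus interrupt that carries the
'unload finished' message.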

The issue is not reproducible on WS2016 as Debug-VM delivers NMI to the
boot CPU only, WS2012R2 and earlier Hyper-V versions are affected.

Signed-off-by: Vitaly Kuznetsov
Acked-by: K. Y. Srinivasan
Cc: devel@linuxdriverproject.org
Cc: Haiyang Zhang
Link: http://lkml.kernel.org/r/20161202100720.28121-1-vkuznets@redhat.com
Signed-off-by: Thomas Gleixner
Signed-off-by: Ingo Molnar

Vitaly Kuznetsov
2016-12-20 16:31:48 +0800
45d36906e Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM fixes from Paolo Bonzini:
"Early fixes for x86.

Instead of the (botched) revert, the lockdep/might_sleep splat has a
real fix provided by Andrea"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
kvm: nVMX: Allow L1 to intercept software exceptions (#BP and #OF)
kvm: take srcu lock around kvm_steal_time_set_preempted()
kvm: fix schedule in atomic in kvm_steal_time_set_preempted()
KVM: hyperv: fix locking of struct kvm_hv fields
KVM: x86: Expose Intel AVX512IFMA/AVX512VBMI/SHA features to guest.
kvm: nVMX: Correct a VMX instruction error code for VMPTRLD

Linus Torvalds
2016-12-20 00:21:29 +0800

19 Dec, 2016

17 commits

ef85b6738 kvm: nVMX: Allow L1 to intercept software exceptions (#BP and #OF)

When L2 exits to L0 due to "exception or NMI", software exceptions
(#BP and #OF) for which L1 has requested an intercept should be
handled by L1 rather than L0. Previously, only hardware exceptions
were forwarded to L1.
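
A much simplified model of the decision is sketched below. The field layout
(vector in bits 7:0, type in bits 10:8, valid in bit 31; type 3 for hardware
and type 6 for software exceptions) follows the VM-exit interruption
information format in the Intel SDM. The model_* names are hypothetical, and
special cases such as page faults are deliberately left out.

```c
#include <assert.h>
#include <stdint.h>

/* Field layout of the VM-exit interruption information. */
#define MODEL_INTR_VECTOR_MASK   0x000000ffu
#define MODEL_INTR_TYPE_MASK     0x00000700u
#define MODEL_INTR_TYPE_HARD_EXC (3u << 8)	/* hardware exception */
#define MODEL_INTR_TYPE_SOFT_EXC (6u << 8)	/* software exception: #BP, #OF */
#define MODEL_INTR_VALID         0x80000000u

/* Should this exit be reflected to L1, given L1's exception bitmap? */
static int model_reflect_to_l1(uint32_t intr_info, uint32_t l1_exception_bitmap)
{
	uint32_t type = intr_info & MODEL_INTR_TYPE_MASK;
	uint32_t vector = intr_info & MODEL_INTR_VECTOR_MASK;

	if (!(intr_info & MODEL_INTR_VALID))
		return 0;
	/* Accept both exception types, not just hardware exceptions. */
	if (type != MODEL_INTR_TYPE_HARD_EXC && type != MODEL_INTR_TYPE_SOFT_EXC)
		return 0;
	assert(vector < 32);	/* exception vectors are 0..31 */
	return (l1_exception_bitmap >> vector) & 1u;
}
```

In this model a #BP (vector 3) delivered as a software exception goes to L1
only if L1 set bit 3 in its exception bitmap; an NMI never matches.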

Signed-off-by: Jim Mattson
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini

Jim Mattson
2016-12-19 23:05:31 +0800
cc0d907c0 kvm: take srcu lock around kvm_steal_time_set_preempted()

kvm_memslots() will be called by kvm_write_guest_offset_cached() so
take the srcu lock.

Signed-off-by: Andrea Arcangeli
Signed-off-by: Paolo Bonzini

Andrea Arcangeli
2016-12-19 22:45:15 +0800
931f261b4 kvm: fix schedule in atomic in kvm_steal_time_set_preempted()

kvm_steal_time_set_preempted() isn't disabling the pagefaults before
calling __copy_to_user and the kernel debug notices.

Signed-off-by: Andrea Arcangeli
Signed-off-by: Paolo Bonzini

Andrea Arcangeli
2016-12-19 22:45:14 +0800
c198b121b x86/asm: Rewrite sync_core() to use IRET-to-self

Aside from being excessively slow, CPUID is problematic: Linux runs
on a handful of CPUs that don't have CPUID. Use IRET-to-self
instead. IRET-to-self works everywhere, so it makes testing easy.

For reference, on my laptop, IRET-to-self is ~110ns,
CPUID(eax=1, ecx=0) is ~83ns on native and very very slow under KVM,
and MOV-to-CR2 is ~42ns.

While we're at it: sync_core() serves a very specific purpose.
Document it.

Signed-off-by: Andy Lutomirski
Cc: Juergen Gross
Cc: One Thousand Gnomes
Cc: Peter Zijlstra
Cc: Brian Gerst
Cc: Matthew Whitehead
Cc: Borislav Petkov
Cc: Henrique de Moraes Holschuh
Cc: Andrew Cooper
Cc: Boris Ostrovsky
Cc: xen-devel
Link: http://lkml.kernel.org/r/5c79f0225f68bc8c40335612bf624511abb78941.1481307769.git.luto@kernel.org
Signed-off-by: Thomas Gleixner

Andy Lutomirski
2016-12-19 18:54:21 +0800
484d0e5c7 x86/microcode/intel: Replace sync_core() with native_cpuid()

The Intel microcode driver is using sync_core() to mean "do CPUID
with EAX=1". I want to rework sync_core(), but first the Intel
microcode driver needs to stop depending on its current behavior.

Reported-by: Henrique de Moraes Holschuh
Signed-off-by: Andy Lutomirski
Acked-by: Borislav Petkov
Cc: Juergen Gross
Cc: One Thousand Gnomes
Cc: Peter Zijlstra
Cc: Brian Gerst
Cc: Matthew Whitehead
Cc: Andrew Cooper
Cc: Boris Ostrovsky
Cc: xen-devel
Link: http://lkml.kernel.org/r/535a025bb91fed1a019c5412b036337ad239e5bb.1481307769.git.luto@kernel.org
Signed-off-by: Thomas Gleixner

Andy Lutomirski
2016-12-19 18:54:21 +0800
426d1aff3 Revert "x86/boot: Fail the boot if !M486 and CPUID is missing"

This reverts commit ed68d7e9b9cfb64f3045ffbcb108df03c09a0f98.

The patch wasn't quite correct -- there are non-Intel (and hence
non-486) CPUs that we support that don't have CPUID. Since we no
longer require CPUID for sync_core(), just revert the patch.

I think the relevant CPUs are Geode and Elan, but I'm not sure.

In principle, we should try to do better at identifying CPUID-less
CPUs in early boot, but that's more complicated.

Reported-by: One Thousand Gnomes
Signed-off-by: Andy Lutomirski
Cc: Juergen Gross
Cc: Denys Vlasenko
Cc: Peter Zijlstra
Cc: Brian Gerst
Cc: Josh Poimboeuf
Cc: Matthew Whitehead
Cc: Borislav Petkov
Cc: Henrique de Moraes Holschuh
Cc: Andrew Cooper
Cc: Boris Ostrovsky
Cc: xen-devel
Cc: Linus Torvalds
Link: http://lkml.kernel.org/r/82acde18a108b8e353180dd6febcc2876df33f24.1481307769.git.luto@kernel.org
Signed-off-by: Thomas Gleixner

Andy Lutomirski
2016-12-19 18:54:20 +0800
1c52d859c x86/asm/32: Make sync_core() handle missing CPUID on all 32-bit kernels

We support various non-Intel CPUs that don't have the CPUID
instruction, so the M486 test was wrong. For now, fix it with a big
hammer: handle missing CPUID on all 32-bit CPUs.

Reported-by: One Thousand Gnomes
Signed-off-by: Andy Lutomirski
Cc: Juergen Gross
Cc: Peter Zijlstra
Cc: Brian Gerst
Cc: Matthew Whitehead
Cc: Borislav Petkov
Cc: Henrique de Moraes Holschuh
Cc: Andrew Cooper
Cc: Boris Ostrovsky
Cc: xen-devel
Link: http://lkml.kernel.org/r/685bd083a7c036f7769510b6846315b17d6ba71f.1481307769.git.luto@kernel.org
Signed-off-by: Thomas Gleixner

Andy Lutomirski
2016-12-19 18:54:20 +0800
3df8d9208 x86/cpu: Probe CPUID leaf 6 even when cpuid_level == 6

A typo (or mis-merge?) resulted in leaf 6 only being probed if
cpuid_level >= 7.

Fixes: 2ccd71f1b278 ("x86/cpufeature: Move some of the scattered feature bits to x86_capability")
Signed-off-by: Andy Lutomirski
Acked-by: Borislav Petkov
Cc: Brian Gerst
Link: http://lkml.kernel.org/r/6ea30c0e9daec21e488b54761881a6dfcf3e04d0.1481825597.git.luto@kernel.org
Signed-off-by: Thomas Gleixner

Andy Lutomirski
2016-12-19 18:50:24 +0800
7ebb91678 x86/tools: Fix gcc-7 warning in relocs.c

gcc-7 warns:

In file included from arch/x86/tools/relocs_64.c:17:0:
arch/x86/tools/relocs.c: In function ‘process_64’:
arch/x86/tools/relocs.c:953:2: warning: argument 1 null where non-null expected [-Wnonnull]
qsort(r->offset, r->count, sizeof(r->offset[0]), cmp_relocs);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from arch/x86/tools/relocs.h:6:0,
from arch/x86/tools/relocs_64.c:1:
/usr/include/stdlib.h:741:13: note: in a call to function ‘qsort’ declared here
extern void qsort

This happens because relocs16 is not used for ELF_BITS == 64,
so there is no point in trying to sort it.

Make the sort_relocs(&relocs16) call 32bit only.
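
The shape of such a change can be sketched in a standalone file. This is a
reduced model, not the relocs.c source: struct model_relocs and the model_*
functions are hypothetical, and ELF_BITS defaults to 64 here purely for the
sake of the example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#ifndef ELF_BITS
#define ELF_BITS 64	/* assumption for this sketch */
#endif

/* Hypothetical reduced version of a relocation table. */
struct model_relocs {
	uint32_t *offset;
	unsigned long count;
};

static int model_cmp_relocs(const void *va, const void *vb)
{
	const uint32_t *a = va, *b = vb;

	return (*a == *b) ? 0 : (*a > *b) ? 1 : -1;
}

static void model_sort_relocs(struct model_relocs *r)
{
	assert(r->offset != NULL);	/* qsort() must not be handed NULL */
	qsort(r->offset, r->count, sizeof(r->offset[0]), model_cmp_relocs);
}

static void model_sort_all(struct model_relocs *r16, struct model_relocs *r32)
{
#if ELF_BITS == 32
	model_sort_relocs(r16);	/* the 16-bit table is only used here */
#else
	(void)r16;		/* unused for 64-bit, so never sorted */
#endif
	model_sort_relocs(r32);
}
```

Compiling the call out for the 64-bit case means the empty table, whose offset
pointer is NULL, never reaches qsort(), which is what the warning was about.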

Signed-off-by: Markus Trippelsdorf
Link: http://lkml.kernel.org/r/20161215124513.GA289@x4
Signed-off-by: Thomas Gleixner

Markus Trippelsdorf
2016-12-19 18:50:24 +0800
8b5e99f02 x86/unwind: Dump stack data on warnings

The unwinder warnings are good at finding unexpected unwinder issues,
but they often don't give enough data to be able to fully diagnose them.
Print a one-time stack dump when a warning is detected.

Signed-off-by: Josh Poimboeuf
Cc: Borislav Petkov
Cc: Andy Lutomirski
Link: http://lkml.kernel.org/r/15607370e3ddb1732b6a73d5c65937864df16ac8.1481904011.git.jpoimboe@redhat.com
Signed-off-by: Thomas Gleixner

Josh Poimboeuf
2016-12-19 18:47:05 +0800
8023e0e2a x86/unwind: Adjust last frame check for aligned function stacks

Somehow, CONFIG_PARAVIRT=n convinces gcc to change the
x86_64_start_kernel() prologue from:

0000000000000129 <x86_64_start_kernel>:
129: 55 push %rbp
12a: 48 89 e5 mov %rsp,%rbp

to:

0000000000000124 <x86_64_start_kernel>:
124: 4c 8d 54 24 08 lea 0x8(%rsp),%r10
129: 48 83 e4 f0 and $0xfffffffffffffff0,%rsp
12d: 41 ff 72 f8 pushq -0x8(%r10)
131: 55 push %rbp
132: 48 89 e5 mov %rsp,%rbp

This is an unusual pattern which aligns rsp (though in this case it's
already aligned) and saves the start_cpu() return address again on the
stack before storing the frame pointer.

The unwinder assumes the last stack frame header is at a certain offset,
but the above code breaks that assumption, resulting in the following
warning:

WARNING: kernel stack frame pointer at ffffffff82e03f40 in swapper:0 has bad value (null)

Fix it by checking for the last task stack frame at the aligned offset
in addition to the normal unaligned offset.

Fixes: acb4608ad186 ("x86/unwind: Create stack frames for saved syscall registers")
Reported-by: Borislav Petkov
Signed-off-by: Josh Poimboeuf
Cc: Andy Lutomirski
Link: http://lkml.kernel.org/r/9d7b4eb8cf55a7d6002cb738f25c23e7429c99a0.1481904011.git.jpoimboe@redhat.com
Signed-off-by: Thomas Gleixner

Josh Poimboeuf
2016-12-19 18:47:05 +0800
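
The last-frame check described above can be sketched roughly as follows. This is a simplified model, not the kernel's unwinder code: the function name, its parameters and the idea of passing the stack as plain word pointers are assumptions for illustration. `regs` stands for the address just above the last frame header (saved frame pointer plus return address). In the aligned prologue the return address is pushed a second time, so the frame pointer ends up one word lower and the word above it is a copy of the original return address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: decide whether bp points at the last stack frame.
 * regs is the address immediately above the normal frame header.
 */
bool is_last_frame(const unsigned long *bp, const unsigned long *regs)
{
    assert(bp != NULL && regs != NULL);

    /* Normal prologue: saved bp and return address sit just below regs. */
    const unsigned long *last_bp = regs - 2;
    /* Aligned prologue: return address pushed again, bp one word lower. */
    const unsigned long *aligned_bp = last_bp - 1;

    if (bp == last_bp)
        return true;

    /* Accept the aligned offset only if the duplicated return address matches. */
    return bp == aligned_bp && aligned_bp[1] == last_bp[1];
}
```

Comparing the duplicated return address keeps the aligned case from matching an arbitrary frame that merely happens to sit one word below the expected offset.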
22d3c0d63 x86/init: Fix a couple of comment typos

Signed-off-by: Dmitry Torokhov
Acked-by: Marcos Paulo de Souza
Cc: linux-input@vger.kernel.org
Link: http://lkml.kernel.org/r/1481317061-31486-5-git-send-email-dmitry.torokhov@gmail.com
Signed-off-by: Thomas Gleixner

Dmitry Torokhov
2016-12-19 18:34:16 +0800
32786fdc9 x86/init: Remove i8042_detect() from platform ops

Now that i8042 uses a flag in the legacy platform data, i8042_detect() is
no longer used and can be removed.

Signed-off-by: Dmitry Torokhov
Tested-by: Takashi Iwai
Acked-by: Marcos Paulo de Souza
Cc: linux-input@vger.kernel.org
Link: http://lkml.kernel.org/r/1481317061-31486-4-git-send-email-dmitry.torokhov@gmail.com
Signed-off-by: Thomas Gleixner

Dmitry Torokhov
2016-12-19 18:34:15 +0800
93ffa9a47 x86/init: Add i8042 state to the platform data

Add i8042 state to the platform data to help the i8042 driver decide
whether to probe for i8042 or not. We recognize 3 states: the
platform/subarch cannot possibly have i8042 (as is the case with the Intel
MID platform), firmware (such as ACPI) reports that i8042 is absent from
the device, or i8042 may be present and the driver should probe for it.

The intent is to allow the i8042 driver to abort initialization on x86 if
PNP data (absence of both keyboard and mouse PNP devices) agrees with
firmware data.

It will also allow us to remove i8042_detect() later.

Signed-off-by: Dmitry Torokhov
Tested-by: Takashi Iwai
Acked-by: Marcos Paulo de Souza
Cc: linux-input@vger.kernel.org
Link: http://lkml.kernel.org/r/1481317061-31486-2-git-send-email-dmitry.torokhov@gmail.com
Signed-off-by: Thomas Gleixner

Dmitry Torokhov
2016-12-19 18:34:15 +0800
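
The three states and the intended driver decision described above could be modelled along these lines. The enum constants and the `i8042_should_probe()` helper are illustrative names chosen for this sketch, not necessarily the identifiers used in the kernel, and the PNP results are passed in as plain booleans.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative names for the three states listed in the commit message. */
enum i8042_state {
    I8042_PLATFORM_ABSENT,   /* platform/subarch cannot have an i8042 */
    I8042_FIRMWARE_ABSENT,   /* firmware (such as ACPI) reports it absent */
    I8042_EXPECTED_PRESENT,  /* may be present, the driver should probe */
};

/* Hypothetical decision helper: should the driver go on to probe? */
bool i8042_should_probe(enum i8042_state state, bool pnp_kbd, bool pnp_mouse)
{
    switch (state) {
    case I8042_PLATFORM_ABSENT:
        return false;
    case I8042_FIRMWARE_ABSENT:
        /* Abort only when PNP agrees: neither keyboard nor mouse found. */
        return pnp_kbd || pnp_mouse;
    case I8042_EXPECTED_PRESENT:
        return true;
    }
    assert(!"unknown i8042 state");
    return false;
}
```

In this sketch a firmware report of absence is treated as a hint that still needs confirmation from PNP data, which mirrors the stated intent of aborting only when both sources agree.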
2b4c91569 x86/microcode/AMD: Use native_cpuid() in load_ucode_amd_bsp()

When CONFIG_PARAVIRT is selected, cpuid() becomes a call. Since
for 32-bit kernels load_ucode_amd_bsp() is executed before paging
is enabled the call cannot be completed (as kernel virtual addresses
are not reachable yet).

Use native_cpuid() instead which is an asm wrapper for the CPUID
instruction.

Signed-off-by: Boris Ostrovsky
Signed-off-by: Borislav Petkov
Cc: Jürgen Gross
Link: http://lkml.kernel.org/r/1481906392-3847-1-git-send-email-boris.ostrovsky@oracle.com
Link: http://lkml.kernel.org/r/20161218164414.9649-5-bp@alien8.de
Signed-off-by: Thomas Gleixner

Boris Ostrovsky
2016-12-19 17:46:20 +0800
a15a75353 x86/microcode/AMD: Do not load when running on a hypervisor

Doing so is completely void of sense for multiple reasons so prevent
it. Set dis_ucode_ldr to true and thus disable the microcode loader by
default to address xen pv guests which execute the AP path but not the
BSP path.

By having it turned off by default, the APs won't run into the loader
either.

Also, check CPUID(1).ECX[31] which hypervisors set. Well almost, not the
xen pv one. That one gets the aforementioned "fix".

Also, improve the detection method by caching the final decision whether
to continue loading in dis_ucode_ldr and do it once on the BSP. The APs
then simply test that value.

Signed-off-by: Borislav Petkov
Tested-by: Juergen Gross
Tested-by: Boris Ostrovsky
Acked-by: Juergen Gross
Link: http://lkml.kernel.org/r/20161218164414.9649-4-bp@alien8.de
Signed-off-by: Thomas Gleixner

Borislav Petkov
2016-12-19 17:46:20 +0800
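
The caching scheme described above can be sketched in a few lines. Only `dis_ucode_ldr` and the CPUID(1).ECX[31] bit come from the commit message; the helper names are hypothetical, and the CPUID result is passed in as a plain value so the sketch does not depend on executing the instruction.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Loader is disabled by default, so APs that run first never load. */
static bool dis_ucode_ldr = true;

/* CPUID(1).ECX bit 31 is the bit hypervisors set. */
bool ecx_reports_hypervisor(uint32_t cpuid1_ecx)
{
    return (cpuid1_ecx >> 31) & 1u;
}

/* Hypothetical BSP step: make the decision once and cache it. */
bool bsp_decide_loader(uint32_t cpuid1_ecx)
{
    dis_ucode_ldr = ecx_reports_hypervisor(cpuid1_ecx);
    return !dis_ucode_ldr;
}

/* Hypothetical AP step: only test the cached value. */
bool ap_may_load(void)
{
    return !dis_ucode_ldr;
}
```

Because the default is "disabled", a guest that executes the AP path without the BSP path (the xen pv case mentioned above) stays out of the loader without needing the CPUID bit.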
200d35531 x86/microcode/AMD: Sanitize apply_microcode_early_amd()

Make it simply return bool to denote whether it found a container or not
and return the pointer to the container and its size in the handed-in
container pointer instead, as returning a struct was just silly.

Signed-off-by: Borislav Petkov
Cc: Jürgen Gross
Cc: Boris Ostrovsky
Link: http://lkml.kernel.org/r/20161218164414.9649-3-bp@alien8.de
Signed-off-by: Thomas Gleixner

Borislav Petkov
2016-12-19 17:46:20 +0800
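
The interface change described above (return a bool, hand results back through a caller-supplied pointer) follows a common C pattern, sketched here under assumptions. `struct container`, `find_container()` and `DEMO_MAGIC` are hypothetical names and values for illustration; they are not the kernel's definitions, and the byte-wise search is a placeholder for the real container parsing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical result type filled in through the handed-in pointer. */
struct container {
    const uint8_t *data;
    size_t size;
};

#define DEMO_MAGIC 0x12345678u  /* made-up marker, not the real magic */

/*
 * Returns true if a container was found; on success *out describes it.
 * On failure *out is left untouched.
 */
bool find_container(const uint8_t *buf, size_t len, struct container *out)
{
    assert(buf != NULL && out != NULL);

    for (size_t off = 0; off + sizeof(uint32_t) <= len; off++) {
        uint32_t word;

        /* memcpy avoids unaligned reads from the byte buffer. */
        memcpy(&word, buf + off, sizeof(word));
        if (word == DEMO_MAGIC) {
            out->data = buf + off;
            out->size = len - off;
            return true;
        }
    }
    return false;
}
```

The caller checks the bool and only then reads the out-parameter, which avoids returning a struct by value just to signal "nothing found".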