14 Jan, 2013

8 commits

  • If userspace starts dirty logging for a large slot, say 64GB of
    memory, kvm_mmu_slot_remove_write_access() needs to hold mmu_lock for
    a long time, on the order of tens of milliseconds. This patch bounds
    the lock hold time by asking the scheduler whether we need to
    reschedule for others (a minimal sketch of this lock-break pattern
    follows the commit list below).

    One penalty for this is that we need to flush TLBs before releasing
    mmu_lock. But since holding mmu_lock for a long time affects not only
    the guest (in other words, the vCPU threads) but also the host as a
    whole, that is a cost worth paying.

    In practice, the cost will not be very high, because we can protect a
    fair amount of memory before being rescheduled: in my test environment,
    cond_resched_lock() was called only once while protecting 12GB of
    memory, even without THP. We can also revisit Avi's "unlocked TLB
    flush" work later to completely suppress the extra TLB flushes if
    needed.

    Reviewed-by: Marcelo Tosatti
    Signed-off-by: Takuya Yoshikawa
    Signed-off-by: Gleb Natapov

    Takuya Yoshikawa
     
  • Better to place mmu_lock handling and TLB flushing code together since
    this is a self-contained function.

    Reviewed-by: Marcelo Tosatti
    Signed-off-by: Takuya Yoshikawa
    Signed-off-by: Gleb Natapov

    Takuya Yoshikawa
     
  • There is no reason to make callers take mmu_lock, since we do not need
    to protect kvm_mmu_change_mmu_pages() and
    kvm_mmu_slot_remove_write_access() together with mmu_lock in
    kvm_arch_commit_memory_region(): the former calls
    kvm_mmu_commit_zap_page() and flushes TLBs by itself.

    Note: we do not need to protect kvm->arch.n_requested_mmu_pages with
    mmu_lock, as can be seen from the fact that it is already read
    locklessly.

    Reviewed-by: Marcelo Tosatti
    Signed-off-by: Takuya Yoshikawa
    Signed-off-by: Gleb Natapov

    Takuya Yoshikawa
     
  • Not needed any more.

    Reviewed-by: Marcelo Tosatti
    Signed-off-by: Takuya Yoshikawa
    Signed-off-by: Gleb Natapov

    Takuya Yoshikawa
     
  • This makes it possible to release mmu_lock and reschedule conditionally
    in a later patch. Although this may increase the time needed to protect
    the whole slot when we start dirty logging, the kernel should not allow
    userspace to trigger something that will hold a spinlock for as long as
    tens of milliseconds: in fact there is no upper bound, since the hold
    time is roughly proportional to the number of guest pages.

    Another point to note is that this patch removes the only user of
    slot_bitmap, which would otherwise cause problems when we increase the
    number of slots further.

    Reviewed-by: Marcelo Tosatti
    Signed-off-by: Takuya Yoshikawa
    Signed-off-by: Gleb Natapov

    Takuya Yoshikawa
     
  • No longer need to care about the mapping level in this function.

    Reviewed-by: Marcelo Tosatti
    Signed-off-by: Takuya Yoshikawa
    Signed-off-by: Gleb Natapov

    Takuya Yoshikawa
     
  • Calling kvm_mmu_slot_remove_write_access() for a deleted slot does
    nothing but search for non-existent mmu pages which have mappings to
    that deleted memory; this is safe but a waste of time.

    Since we want to make the function rmap based in a later patch, in a
    manner which makes it unsafe to call for a deleted slot, we make the
    caller check that the slot is non-zero and has dirty logging enabled.

    Reviewed-by: Marcelo Tosatti
    Signed-off-by: Takuya Yoshikawa
    Signed-off-by: Gleb Natapov

    Takuya Yoshikawa
     
  • Gleb Natapov
     
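A minimal, hypothetical C sketch of the lock-break pattern described in the
first commit above: walk the slot's pages under a lock, and periodically
flush TLBs and yield the lock when a reschedule is wanted. The names
(protect_one_page, flush_tlbs, need_resched_hint, remove_write_access) are
illustrative stand-ins rather than the real KVM functions, and a pthread
mutex stands in for mmu_lock.

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdio.h>
    #include <pthread.h>

    /* Illustrative stand-ins for kernel primitives (not the real KVM code). */
    static pthread_mutex_t mmu_lock = PTHREAD_MUTEX_INITIALIZER;
    static bool tlb_needs_flush;

    static void protect_one_page(size_t gfn)  { tlb_needs_flush = true; (void)gfn; }
    static void flush_tlbs(void)              { tlb_needs_flush = false; }
    static bool need_resched_hint(size_t gfn) { return gfn && (gfn % 4096) == 0; }

    /*
     * Write-protect every page of a slot, but never hold the lock for an
     * unbounded time: when a reschedule is wanted, flush TLBs first (so no
     * stale writable mapping survives), then drop and reacquire the lock.
     */
    static void remove_write_access(size_t slot_npages)
    {
            pthread_mutex_lock(&mmu_lock);

            for (size_t gfn = 0; gfn < slot_npages; gfn++) {
                    protect_one_page(gfn);

                    if (need_resched_hint(gfn)) {
                            if (tlb_needs_flush)
                                    flush_tlbs();
                            pthread_mutex_unlock(&mmu_lock); /* let others run */
                            pthread_mutex_lock(&mmu_lock);
                    }
            }

            if (tlb_needs_flush)
                    flush_tlbs();
            pthread_mutex_unlock(&mmu_lock);
    }

    int main(void)
    {
            remove_write_access(16 * 1024); /* e.g. a 64MB slot of 4KB pages */
            puts("slot write-protected");
            return 0;
    }

The key point, as the commit notes, is that TLBs are flushed before the lock
is released; otherwise another vCPU could keep writing through a stale
writable spte while the dirty-logging code already considers the page
protected.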

11 Jan, 2013

3 commits

  • trace_kvm_userspace_exit has been missing the KVM_EXIT_WATCHDOG exit.

    CC: Bharat Bhushan
    Signed-off-by: Cornelia Huck
    Signed-off-by: Marcelo Tosatti

    Cornelia Huck
     
  • We have two issues in the current code:
    - if the target gfn is used as its own page table, the guest will
    refault and kvm will then use a small page size to map it. We need two
    #PFs to fix its shadow page table.

    - sometimes, say when an exception is triggered during a vm-exit caused
    by a #PF (see handle_exception() in vmx.c), we remove all the shadow
    pages shadowed by the target gfn before going into the page fault path,
    which causes an infinite loop:
    delete the shadow pages shadowed by the gfn -> try to use a large page
    size to map the gfn -> retry the access -> ...

    To fix these, we can adjust the page size early if the target gfn is
    used as a page table.

    Signed-off-by: Xiao Guangrong
    Signed-off-by: Marcelo Tosatti

    Xiao Guangrong
     
  • If the write-fault access is from the supervisor and CR0.WP is not set
    on the vcpu, kvm fixes it up by adjusting the pte access: it sets the W
    bit on the pte and clears the U bit. This is the case where kvm can
    change the pte access from read-only to writable.

    Unfortunately, the pte access here is the access of the 'direct' shadow
    page table, meaning direct sp.role.access = pte_access, so we end up
    creating a writable spte entry in a read-only shadow page table. As a
    result the Dirty bit is not tracked when two guest ptes point to the
    same large page. Note that it has no impact other than the Dirty bit,
    since cr0.wp is encoded into sp.role.

    This can be fixed by adjusting the pte access before establishing the
    shadow page table (a simplified sketch of this adjustment follows the
    commit list below). After that, no mmu-specific code remains in the
    common function, and two parameters can be dropped from set_spte().

    Signed-off-by: Xiao Guangrong
    Signed-off-by: Marcelo Tosatti

    Xiao Guangrong
     
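As a rough illustration of the last commit above, the sketch below adjusts
the permission bits before they are propagated into the shadow-page role,
rather than patching the spte afterwards. The bit names and the helper are
hypothetical simplifications, not the actual set_spte()/shadow-walk code.

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Simplified access bits, loosely modeled on x86 pte permissions. */
    #define ACC_WRITE (1u << 1)   /* W: writable        */
    #define ACC_USER  (1u << 2)   /* U: user-accessible */

    /*
     * A supervisor write with CR0.WP=0 is allowed even through a read-only
     * pte.  Adjusting the access *before* it is used to pick the shadow
     * page keeps sp.role.access consistent with the sptes created below it.
     */
    static uint32_t adjust_pte_access(uint32_t pte_access, bool write_fault,
                                      bool user_fault, bool cr0_wp)
    {
            if (write_fault && !user_fault && !cr0_wp) {
                    pte_access |= ACC_WRITE;  /* set W   */
                    pte_access &= ~ACC_USER;  /* clear U */
            }
            return pte_access;
    }

    int main(void)
    {
            uint32_t acc = adjust_pte_access(0,     /* read-only pte    */
                                             true,  /* write fault      */
                                             false, /* from supervisor  */
                                             false  /* CR0.WP clear     */);
            printf("adjusted access: W=%d U=%d\n",
                   !!(acc & ACC_WRITE), !!(acc & ACC_USER));
            return 0;
    }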


08 Jan, 2013

6 commits

  • The MMU code tries to avoid if()s that the hardware is not able to
    predict reliably by using bitwise operations to streamline code
    execution, but in the case of the dirty-bit folding this gains us
    nothing, since write_fault is checked right before the folding code.
    Let's just piggyback onto that if() to make the code clearer (a small
    before/after illustration follows the commit list below).

    Signed-off-by: Gleb Natapov
    Signed-off-by: Marcelo Tosatti

    Gleb Natapov
     
  • trace_kvm_mmu_delay_free_pages() is no longer used.

    Signed-off-by: Gleb Natapov
    Signed-off-by: Marcelo Tosatti

    Gleb Natapov
     
  • Add a new capability, KVM_CAP_S390_CSS_SUPPORT, which will pass
    intercepts for channel I/O instructions to userspace. Only I/O
    instructions interacting with I/O interrupts need to be handled
    in-kernel:

    - TEST PENDING INTERRUPTION (tpi) dequeues and stores pending
    interrupts entirely in-kernel.
    - TEST SUBCHANNEL (tsch) dequeues pending interrupts in-kernel
    and exits via KVM_EXIT_S390_TSCH to userspace for subchannel-related
    processing.

    (A userspace sketch of enabling this capability via KVM_ENABLE_CAP
    follows the commit list below.)

    Reviewed-by: Marcelo Tosatti
    Reviewed-by: Alexander Graf
    Signed-off-by: Cornelia Huck
    Signed-off-by: Marcelo Tosatti

    Cornelia Huck
     
  • Make s390 support KVM_ENABLE_CAP.

    Reviewed-by: Marcelo Tosatti
    Acked-by: Alexander Graf
    Signed-off-by: Cornelia Huck
    Signed-off-by: Marcelo Tosatti

    Cornelia Huck
     
  • Explicitly catch all channel I/O-related instruction intercepts
    in the kernel and set condition code 3 for them.

    This paves the way for properly handling these instructions later
    on.

    Note: This is not architecture-compliant (the previous code wasn't
    either), since setting cc 3 is not the correct thing to do for some
    of these instructions. For Linux guests, however, it still has the
    intended effect of stopping css probing.

    Reviewed-by: Marcelo Tosatti
    Reviewed-by: Alexander Graf
    Signed-off-by: Cornelia Huck
    Signed-off-by: Marcelo Tosatti

    Cornelia Huck
     
  • Add support for injecting machine checks (only repressible
    conditions for now).

    This is a bit more involved than I/O interrupts, for these reasons:

    - Machine checks come in both floating and cpu varieties.
    - We don't have a bit for enabling machine checks, but have to use
    a roundabout approach: trap PSW-changing instructions and watch for
    machine checks being opened up.

    Reviewed-by: Alexander Graf
    Reviewed-by: Marcelo Tosatti
    Signed-off-by: Cornelia Huck
    Signed-off-by: Marcelo Tosatti

    Cornelia Huck
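
Two of the commits above lend themselves to small userspace sketches. First,
the dirty-bit folding change: the branchless form avoids an if() the hardware
might mispredict, but since write_fault is already tested right before, the
plain if() is clearer. The spte layout here is a made-up stand-in, not the
real x86 spte format.

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    #define SPTE_DIRTY_SHIFT 9                   /* illustrative bit position */
    #define SPTE_DIRTY_MASK  (1ULL << SPTE_DIRTY_SHIFT)

    /* Branchless "folding": shift the boolean into place unconditionally. */
    static uint64_t set_dirty_folded(uint64_t spte, bool write_fault)
    {
            return spte | ((uint64_t)write_fault << SPTE_DIRTY_SHIFT);
    }

    /* Clearer form: piggyback on the if (write_fault) test that already exists. */
    static uint64_t set_dirty_plain(uint64_t spte, bool write_fault)
    {
            if (write_fault)
                    spte |= SPTE_DIRTY_MASK;
            return spte;
    }

    int main(void)
    {
            printf("folded: %#llx, plain: %#llx\n",
                   (unsigned long long)set_dirty_folded(0, true),
                   (unsigned long long)set_dirty_plain(0, true));
            return 0;
    }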
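
Second, the KVM_ENABLE_CAP plumbing for KVM_CAP_S390_CSS_SUPPORT: a userspace
VMM on an s390 host might enable the capability roughly as below. This is a
fragment, assuming kvm_fd comes from opening /dev/kvm and vcpu_fd from
KVM_CREATE_VCPU, and that the installed linux/kvm.h defines the capability;
error handling is minimal.

    #include <stdio.h>
    #include <string.h>
    #include <sys/ioctl.h>
    #include <linux/kvm.h>

    /*
     * Enable in-kernel channel-subsystem support.  Returns the ioctl result,
     * or -1 if the host kernel does not know the capability at all.
     */
    int enable_css_support(int kvm_fd, int vcpu_fd)
    {
            struct kvm_enable_cap cap;

            if (ioctl(kvm_fd, KVM_CHECK_EXTENSION, KVM_CAP_S390_CSS_SUPPORT) <= 0) {
                    fprintf(stderr, "KVM_CAP_S390_CSS_SUPPORT not available\n");
                    return -1;
            }

            memset(&cap, 0, sizeof(cap));
            cap.cap = KVM_CAP_S390_CSS_SUPPORT;

            /* Enabled per vcpu, but (per the commit) it affects the whole VM. */
            return ioctl(vcpu_fd, KVM_ENABLE_CAP, &cap);
    }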