Commit Graph

706 Commits

Author SHA1 Message Date
Paul Mackerras
4619ac88b7 KVM: PPC: Book3S HV: Improve real-mode handling of external interrupts
This streamlines our handling of external interrupts that come in
while we're in the guest.  First, when waking up a hardware thread
that was napping, we split off the "napping due to H_CEDE" case
earlier, and use the code that handles an external interrupt (0x500)
in the guest to handle that too.  Secondly, the code that handles
those external interrupts now checks if any other thread is exiting
to the host before bouncing an external interrupt to the guest, and
also checks that there is actually an external interrupt pending for
the guest before setting the LPCR MER bit (mediated external request).

This also makes sure that we clear the "ceded" flag when we handle a
wakeup from cede in real mode, and fixes a potential infinite loop
in kvmppc_run_vcpu() which can occur if we ever end up with the ceded
flag set but MSR[EE] off.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-04-26 20:27:32 +02:00
Benjamin Herrenschmidt
e7d26f285b KVM: PPC: Book3S HV: Add support for real mode ICP in XICS emulation
This adds an implementation of the XICS hypercalls in real mode for HV
KVM, which allows us to avoid exiting the guest MMU context on all
threads for a variety of operations such as fetching a pending
interrupt, EOI of messages, IPIs, etc.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-04-26 20:27:32 +02:00
Benjamin Herrenschmidt
54695c3088 KVM: PPC: Book3S HV: Speed up wakeups of CPUs on HV KVM
Currently, we wake up a CPU by sending a host IPI with
smp_send_reschedule() to thread 0 of that core, which will take all
threads out of the guest, and cause them to re-evaluate their
interrupt status on the way back in.

This adds a mechanism to differentiate real host IPIs from IPIs sent
by KVM for guest threads to poke each other, in order to target the
guest threads precisely when possible and avoid that global switch of
the core to host state.

We then use this new facility in the in-kernel XICS code.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-04-26 20:27:31 +02:00
Benjamin Herrenschmidt
bc5ad3f370 KVM: PPC: Book3S: Add kernel emulation for the XICS interrupt controller
This adds in-kernel emulation of the XICS (eXternal Interrupt
Controller Specification) interrupt controller specified by PAPR, for
both HV and PR KVM guests.

The XICS emulation supports up to 1048560 interrupt sources.
Interrupt source numbers below 16 are reserved; 0 is used to mean no
interrupt and 2 is used for IPIs.  Internally these are represented in
blocks of 1024, called ICS (interrupt controller source) entities, but
that is not visible to userspace.

Each vcpu gets one ICP (interrupt controller presentation) entity,
used to store the per-vcpu state such as vcpu priority, pending
interrupt state, IPI request, etc.

This does not include any API or any way to connect vcpus to their
ICP state; that will be added in later patches.

This is based on an initial implementation by Michael Ellerman
<michael@ellerman.id.au> reworked by Benjamin Herrenschmidt and
Paul Mackerras.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
[agraf: fix typo, add dependency on !KVM_MPIC]
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-04-26 20:27:30 +02:00
Michael Ellerman
8e591cb720 KVM: PPC: Book3S: Add infrastructure to implement kernel-side RTAS calls
For pseries machine emulation, in order to move the interrupt
controller code to the kernel, we need to intercept some RTAS
calls in the kernel itself.  This adds an infrastructure to allow
in-kernel handlers to be registered for RTAS services by name.
A new ioctl, KVM_PPC_RTAS_DEFINE_TOKEN, then allows userspace to
associate token values with those service names.  Then, when the
guest requests an RTAS service with one of those token values, it
will be handled by the relevant in-kernel handler rather than being
passed up to userspace as at present.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
[agraf: fix warning]
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-04-26 20:27:29 +02:00
Scott Wood
91194919a6 kvm/ppc/mpic: Eliminate mmio_mapped
We no longer need to keep track of this now that MPIC destruction
always happens either during VM destruction (after MMIO has been
destroyed) or during a failed creation (before the fd has been exposed
to userspace, and thus before the MMIO region could have been
registered).

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-04-26 20:27:28 +02:00
Scott Wood
07f0a7bdec kvm: destroy emulated devices on VM exit
The hassle of getting refcounting right was greater than the hassle
of keeping a list of devices to destroy on VM exit.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-04-26 20:27:28 +02:00
Alexander Graf
447a03c02a KVM: PPC: MPIC: Restrict to e500 platforms
The code as is doesn't make any sense on non-e500 platforms. Restrict it
there, so that people don't get wrong ideas on what would actually work.

This patch should get reverted as soon as it's possible to either run e500
guests on non-e500 hosts or the MPIC emulation gains support for non-e500
modes.

Signed-off-by: Alexander Graf <agraf@suse.de>
2013-04-26 20:27:26 +02:00
Alexander Graf
5efdb4be59 KVM: PPC: MPIC: Add support for KVM_IRQ_LINE
Now that all pieces are in place for reusing generic irq infrastructure,
we can copy x86's implementation of KVM_IRQ_LINE irq injection and simply
reuse it for PPC, as it will work there just as well.

Signed-off-by: Alexander Graf <agraf@suse.de>
2013-04-26 20:27:25 +02:00
Alexander Graf
de9ba2f363 KVM: PPC: Support irq routing and irqfd for in-kernel MPIC
Now that all the irq routing and irqfd pieces are generic, we can expose
real irqchip support to all of KVM's internal helpers.

This allows us to use irqfd with the in-kernel MPIC.

Signed-off-by: Alexander Graf <agraf@suse.de>
2013-04-26 20:27:25 +02:00
Scott Wood
eb1e4f43e0 kvm/ppc/mpic: add KVM_CAP_IRQ_MPIC
Enabling this capability connects the vcpu to the designated in-kernel
MPIC.  Using explicit connections between vcpus and irqchips allows
for flexibility, but the main benefit at the moment is that it
simplifies the code -- KVM doesn't need vm-global state to remember
which MPIC object is associated with this vm, and it doesn't need to
care about ordering between irqchip creation and vcpu creation.

Signed-off-by: Scott Wood <scottwood@freescale.com>
[agraf: add stub functions for kvmppc_mpic_{dis,}connect_vcpu]
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-04-26 20:27:24 +02:00
Scott Wood
5df554ad5b kvm/ppc/mpic: in-kernel MPIC emulation
Hook the MPIC code up to the KVM interfaces, add locking, etc.

Signed-off-by: Scott Wood <scottwood@freescale.com>
[agraf: add stub function for kvmppc_mpic_set_epr, non-booke, 64bit]
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-04-26 20:27:23 +02:00
Scott Wood
f0f5c481a9 kvm/ppc/mpic: adapt to kernel style and environment
Remove braces that Linux style doesn't permit, remove space after
'*' that Lindent added, keep error/debug strings contiguous, etc.

Substitute type names, debug prints, etc.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-04-26 20:27:22 +02:00
Scott Wood
6dd830a09a kvm/ppc/mpic: remove some obviously unneeded code
Remove some parts of the code that are obviously QEMU or Raven specific
before fixing style issues, to reduce the style issues that need to be
fixed.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-04-26 20:27:22 +02:00
Scott Wood
b823f98f89 kvm/ppc/mpic: import hw/openpic.c from QEMU
This is QEMU's hw/openpic.c from commit
abd8d4a4d6dfea7ddea72f095f993e1de941614e ("Update version for
1.4.0-rc0"), run through Lindent with no other changes to ease merging
future changes between Linux and QEMU.  Remaining style issues
(including those introduced by Lindent) will be fixed in a later patch.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-04-26 20:27:20 +02:00
Paul Mackerras
c35635efdc KVM: PPC: Book3S HV: Report VPA and DTL modifications in dirty map
At present, the KVM_GET_DIRTY_LOG ioctl doesn't report modifications
done by the host to the virtual processor areas (VPAs) and dispatch
trace logs (DTLs) registered by the guest.  This is because those
modifications are done either in real mode or in the host kernel
context, and in neither case does the access go through the guest's
HPT, and thus no change (C) bit gets set in the guest's HPT.

However, the changes done by the host do need to be tracked so that
the modified pages get transferred when doing live migration.  In
order to track these modifications, this adds a dirty flag to the
struct representing the VPA/DTL areas, and arranges to set the flag
when the VPA/DTL gets modified by the host.  Then, when we are
collecting the dirty log, we also check the dirty flags for the
VPA and DTL for each vcpu and set the relevant bit in the dirty log
if necessary.  Doing this also means we now need to keep track of
the guest physical address of the VPA/DTL areas.

So as not to lose track of modifications to a VPA/DTL area when it gets
unregistered, or when a new area gets registered in its place, we need
to transfer the dirty state to the rmap chain.  This adds code to
kvmppc_unpin_guest_page() to do that if the area was dirty.  To simplify
that code, we now require that all VPA, DTL and SLB shadow buffer areas
fit within a single host page.  Guests already comply with this
requirement because pHyp requires that these areas not cross a 4k
boundary.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-04-26 20:27:13 +02:00
Paul Mackerras
a1b4a0f606 KVM: PPC: Book3S HV: Make HPT reading code notice R/C bit changes
At present, the code that determines whether a HPT entry has changed,
and thus needs to be sent to userspace when it is copying the HPT,
doesn't consider a hardware update to the reference and change bits
(R and C) in the HPT entries to constitute a change that needs to
be sent to userspace.  This adds code to check for changes in R and C
when we are scanning the HPT to find changed entries, and adds code
to set the changed flag for the HPTE when we update the R and C bits
in the guest view of the HPTE.

Since we now need to set the HPTE changed flag in book3s_64_mmu_hv.c
as well as book3s_hv_rm_mmu.c, we move the note_hpte_modification()
function into kvm_book3s_64.h.

Current Linux guest kernels don't use the hardware updates of R and C
in the HPT, so this change won't affect them.  Linux (or other) kernels
might in future want to use the R and C bits and have them correctly
transferred across when a guest is migrated, so it is better to correct
this deficiency.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-04-26 20:27:12 +02:00
Mihai Caraman
d9ce6041b3 KVM: PPC: e500: Add e6500 core to Kconfig description
Add e6500 core to Kconfig description.

Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-04-26 20:27:11 +02:00
Mihai Caraman
ea17a971c2 KVM: PPC: e500mc: Enable e6500 cores
Extend processor compatibility names to e6500 cores.

Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-04-26 20:27:10 +02:00
Mihai Caraman
5b21501045 KVM: PPC: e500: Remove E.PT and E.HV.LRAT categories from VCPUs
Embedded.Page Table (E.PT) category is not supported yet in e6500 kernel.
Configure TLBnCFG to remove E.PT and E.HV.LRAT categories from VCPUs.

Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-04-26 20:27:09 +02:00
Mihai Caraman
9a6061d7fd KVM: PPC: e500: Add support for EPTCFG register
EPTCFG register defined by E.PT is accessed unconditionally by Linux guests
in the presence of MAV 2.0. Emulate it now.

Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-04-26 20:27:08 +02:00
Mihai Caraman
307d9008ed KVM: PPC: e500: Add support for TLBnPS registers
Add support for TLBnPS registers available in MMU Architecture Version
(MAV) 2.0.

Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-04-26 20:27:07 +02:00
Mihai Caraman
8893a188b1 KVM: PPC: e500: Move vcpu's MMU configuration to dedicated functions
Vcpu's MMU default configuration and geometry update logic was buried in
a chunk of code. Move them to dedicated functions to add more clarity.

Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-04-26 20:27:07 +02:00
Mihai Caraman
a85d2aa23e KVM: PPC: e500: Expose MMU registers via ONE_REG
MMU registers were exposed to user-space using sregs interface. Add them
to ONE_REG interface using kvmppc_get_one_reg/kvmppc_set_one_reg delegation
mechanism.

Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-04-26 20:27:06 +02:00
Mihai Caraman
35b299e279 KVM: PPC: Book3E: Refactor ONE_REG ioctl implementation
Refactor Book3E ONE_REG ioctl implementation to use kvmppc_get_one_reg/
kvmppc_set_one_reg delegation interface introduced by Book3S. This is
necessary for MMU SPRs which are platform specifics.

Get rid of useless case braces in the process.

Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-04-26 20:27:05 +02:00
Bharat Bhushan
9b4f530807 booke: exit to user space if emulator request
This allows the exit to user space if emulator request by returning
EMULATE_EXIT_USER. This will be used in subsequent patches in list

Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-04-26 20:27:04 +02:00
Bharat Bhushan
0f47f9b517 KVM: extend EMULATE_EXIT_USER to support different exit reasons
Currently the instruction emulator code returns EMULATE_EXIT_USER
and common code initializes the "run->exit_reason = .." and
"vcpu->arch.hcall_needed = .." with one fixed reason.
But there can be different reasons when emulator need to exit
to user space. To support that the "run->exit_reason = .."
and "vcpu->arch.hcall_needed = .." initialization is moved a
level up to emulator.

Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-04-26 20:27:03 +02:00
Bharat Bhushan
c402a3f457 Rename EMULATE_DO_PAPR to EMULATE_EXIT_USER
Instruction emulation return EMULATE_DO_PAPR when it requires
exit to userspace on book3s. Similar return is required
for booke. EMULATE_DO_PAPR reads out to be confusing so it is
renamed to EMULATE_EXIT_USER.

Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-04-26 20:27:03 +02:00
Bharat Bhushan
092d62ee93 KVM: PPC: debug stub interface parameter defined
This patch defines the interface parameter for KVM_SET_GUEST_DEBUG
ioctl support. Follow up patches will use this for setting up
hardware breakpoints, watchpoints and software breakpoints.

Also kvm_arch_vcpu_ioctl_set_guest_debug() is brought one level below.
This is because I am not sure what is required for book3s. So this ioctl
behaviour will not change for book3s.

Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-04-26 20:27:02 +02:00
Scott Wood
be28a27c99 kvm/ppc: don't call complete_mmio_load when it's a store
complete_mmio_load writes back the mmio result into the
destination register.  Doing this on a store results in
register corruption.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-04-17 15:21:16 +02:00
Stuart Yoder
c32498ee64 KVM: PPC: emulate dcbst
Signed-off-by: Stuart Yoder <stuart.yoder@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-04-17 15:21:15 +02:00
Bharat Bhushan
8c32a2ea65 Added ONE_REG interface for debug instruction
This patch adds the one_reg interface to get the special instruction
to be used for setting software breakpoint from userspace.

Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-04-17 15:21:14 +02:00
Paul Mackerras
4fe27d2add KVM: PPC: Remove unused argument to kvmppc_core_dequeue_external
Currently kvmppc_core_dequeue_external() takes a struct kvm_interrupt *
argument and does nothing with it, in any of its implementations.
This removes it in order to make things easier for forthcoming
in-kernel interrupt controller emulation code.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-03-22 01:21:17 +01:00
Scott Wood
47bf379742 kvm/ppc/e500: eliminate tlb_refs
Commit 523f0e5421 ("KVM: PPC: E500:
Explicitly mark shadow maps invalid") began using E500_TLB_VALID
for guest TLB1 entries, and skipping invalidations if it's not set.

However, when E500_TLB_VALID was set for such entries, it was on a
fake local ref, and so the invalidations never happen.  gtlb_privs
is documented as being only for guest TLB0, though we already violate
that with E500_TLB_BITMAP.

Now that we have MMU notifiers, and thus don't need to actually
retain a reference to the mapped pages, get rid of tlb_refs, and
use gtlb_privs for E500_TLB_VALID in TLB1.

Since we can have more than one host TLB entry for a given tlbe_ref,
be careful not to clear existing flags that are relevant to other
host TLB entries when preparing a new host TLB entry.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-03-22 01:21:15 +01:00
Scott Wood
36ada4f431 kvm/ppc/e500: g2h_tlb1_map: clear old bit before setting new bit
It's possible that we're using the same host TLB1 slot to map (a
presumably different portion of) the same guest TLB1 entry.  Clear
the bit in the map before setting it, so that if the esels are the same
the bit will remain set.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-03-22 01:21:13 +01:00
Scott Wood
d6940b6416 kvm/ppc/e500: h2g_tlb1_rmap: esel 0 is valid
Add one to esel values in h2g_tlb1_rmap, so that "no mapping" can be
distinguished from "esel 0".  Note that we're not saved by the fact
that host esel 0 is reserved for non-KVM use, because KVM host esel
numbering is not the raw host numbering (see to_htlb1_esel).

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-03-22 01:21:11 +01:00
Bharat Bhushan
15b708beee KVM: PPC: booke: Added debug handler
Installed debug handler will be used for guest debug support
and debug facility emulation features (patches for these
features will follow this patch).

Signed-off-by: Liu Yu <yu.liu@freescale.com>
[bharat.bhushan@freescale.com: Substantial changes]
Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-03-22 01:21:09 +01:00
Bharat Bhushan
78accda4f8 KVM: PPC: Added one_reg interface for timer registers
If userspace wants to change some specific bits of TSR
(timer status register) then it uses GET/SET_SREGS ioctl interface.
So the steps will be:
      i)   user-space will make get ioctl,
      ii)  change TSR in userspace
      iii) then make set ioctl.
It can happen that TSR gets changed by kernel after step i) and
before step iii).

To avoid this we have added below one_reg ioctls for oring and clearing
specific bits in TSR. This patch adds one registerface for:
     1) setting specific bit in TSR (timer status register)
     2) clearing specific bit in TSR (timer status register)
     3) setting/getting the TCR register. There are cases where we want to only
        change TCR and not TSR. Although we can uses SREGS without
        KVM_SREGS_E_UPDATE_TSR flag but I think one reg is better. I am open
        if someone feels we should use SREGS only here.
     4) getting/setting TSR register

Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-03-22 01:21:06 +01:00
Bharat Bhushan
d26f22c9cd KVM: PPC: move tsr update in a separate function
This is done so that same function can be called from SREGS and
ONE_REG interface (follow up patch).

Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-03-22 01:21:05 +01:00
Marcelo Tosatti
2ae33b3896 Merge remote-tracking branch 'upstream/master' into queue
Merge reason:

From: Alexander Graf <agraf@suse.de>

"Just recently this really important patch got pulled into Linus' tree for 3.9:

commit 1674400aae
Author: Anton Blanchard <anton <at> samba.org>
Date:   Tue Mar 12 01:51:51 2013 +0000

Without that commit, I can not boot my G5, thus I can't run automated tests on it against my queue.

Could you please merge kvm/next against linus/master, so that I can base my trees against that?"

* upstream/master: (653 commits)
  PCI: Use ROM images from firmware only if no other ROM source available
  sparc: remove unused "config BITS"
  sparc: delete "if !ULTRA_HAS_POPULATION_COUNT"
  KVM: Fix bounds checking in ioapic indirect register reads (CVE-2013-1798)
  KVM: x86: Convert MSR_KVM_SYSTEM_TIME to use gfn_to_hva_cache functions (CVE-2013-1797)
  KVM: x86: fix for buffer overflow in handling of MSR_KVM_SYSTEM_TIME (CVE-2013-1796)
  arm64: Kconfig.debug: Remove unused CONFIG_DEBUG_ERRORS
  arm64: Do not select GENERIC_HARDIRQS_NO_DEPRECATED
  inet: limit length of fragment queue hash table bucket lists
  qeth: Fix scatter-gather regression
  qeth: Fix invalid router settings handling
  qeth: delay feature trace
  sgy-cts1000: Remove __dev* attributes
  KVM: x86: fix deadlock in clock-in-progress request handling
  KVM: allow host header to be included even for !CONFIG_KVM
  hwmon: (lm75) Fix tcn75 prefix
  hwmon: (lm75.h) Update header inclusion
  MAINTAINERS: Remove Mark M. Hoffman
  xfs: ensure we capture IO errors correctly
  xfs: fix xfs_iomap_eof_prealloc_initial_size type
  ...

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2013-03-21 11:11:52 -03:00
Aneesh Kumar K.V
af81d7878c powerpc: Rename USER_ESID_BITS* to ESID_BITS*
Now we use ESID_BITS of kernel address to build proto vsid. So rename
USER_ESIT_BITS to ESID_BITS

Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: <stable@vger.kernel.org> [v3.8]
2013-03-17 12:45:44 +11:00
Takuya Yoshikawa
8482644aea KVM: set_memory_region: Refactor commit_memory_region()
This patch makes the parameter old a const pointer to the old memory
slot and adds a new parameter named change to know the change being
requested: the former is for removing extra copying and the latter is
for cleaning up the code.

Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2013-03-04 20:21:08 -03:00
Takuya Yoshikawa
7b6195a91d KVM: set_memory_region: Refactor prepare_memory_region()
This patch drops the parameter old, a copy of the old memory slot, and
adds a new parameter named change to know the change being requested.

This not only cleans up the code but also removes extra copying of the
memory slot structure.

Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2013-03-04 20:21:08 -03:00
Takuya Yoshikawa
462fce4606 KVM: set_memory_region: Drop user_alloc from prepare/commit_memory_region()
X86 does not use this any more.  The remaining user, s390's !user_alloc
check, can be simply removed since KVM_SET_MEMORY_REGION ioctl is no
longer supported.

Note: fixed powerpc's indentations with spaces to suppress checkpatch
errors.

Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2013-03-04 20:21:08 -03:00
Sasha Levin
b67bfe0d42 hlist: drop the node parameter from iterators
I'm not sure why, but the hlist for each entry iterators were conceived

        list_for_each_entry(pos, head, member)

The hlist ones were greedy and wanted an extra parameter:

        hlist_for_each_entry(tpos, pos, head, member)

Why did they need an extra pos parameter? I'm not quite sure. Not only
they don't really need it, it also prevents the iterator from looking
exactly like the list iterator, which is unfortunate.

Besides the semantic patch, there was some manual work required:

 - Fix up the actual hlist iterators in linux/list.h
 - Fix up the declaration of other iterators based on the hlist ones.
 - A very small amount of places were using the 'node' parameter, this
 was modified to use 'obj->member' instead.
 - Coccinelle didn't handle the hlist_for_each_entry_safe iterator
 properly, so those had to be fixed up manually.

The semantic patch which is mostly the work of Peter Senna Tschudin is here:

@@
iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host;

type T;
expression a,c,d,e;
identifier b;
statement S;
@@

-T b;
    <+... when != b
(
hlist_for_each_entry(a,
- b,
c, d) S
|
hlist_for_each_entry_continue(a,
- b,
c) S
|
hlist_for_each_entry_from(a,
- b,
c) S
|
hlist_for_each_entry_rcu(a,
- b,
c, d) S
|
hlist_for_each_entry_rcu_bh(a,
- b,
c, d) S
|
hlist_for_each_entry_continue_rcu_bh(a,
- b,
c) S
|
for_each_busy_worker(a, c,
- b,
d) S
|
ax25_uid_for_each(a,
- b,
c) S
|
ax25_for_each(a,
- b,
c) S
|
inet_bind_bucket_for_each(a,
- b,
c) S
|
sctp_for_each_hentry(a,
- b,
c) S
|
sk_for_each(a,
- b,
c) S
|
sk_for_each_rcu(a,
- b,
c) S
|
sk_for_each_from
-(a, b)
+(a)
S
+ sk_for_each_from(a) S
|
sk_for_each_safe(a,
- b,
c, d) S
|
sk_for_each_bound(a,
- b,
c) S
|
hlist_for_each_entry_safe(a,
- b,
c, d, e) S
|
hlist_for_each_entry_continue_rcu(a,
- b,
c) S
|
nr_neigh_for_each(a,
- b,
c) S
|
nr_neigh_for_each_safe(a,
- b,
c, d) S
|
nr_node_for_each(a,
- b,
c) S
|
nr_node_for_each_safe(a,
- b,
c, d) S
|
- for_each_gfn_sp(a, c, d, b) S
+ for_each_gfn_sp(a, c, d) S
|
- for_each_gfn_indirect_valid_sp(a, c, d, b) S
+ for_each_gfn_indirect_valid_sp(a, c, d) S
|
for_each_host(a,
- b,
c) S
|
for_each_host_safe(a,
- b,
c, d) S
|
for_each_mesh_entry(a,
- b,
c, d) S
)
    ...+>

[akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c]
[akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c]
[akpm@linux-foundation.org: checkpatch fixes]
[akpm@linux-foundation.org: fix warnings]
[akpm@linux-foudnation.org: redo intrusive kvm changes]
Tested-by: Peter Senna Tschudin <peter.senna@gmail.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27 19:10:24 -08:00
Linus Torvalds
89f883372f Merge tag 'kvm-3.9-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM updates from Marcelo Tosatti:
 "KVM updates for the 3.9 merge window, including x86 real mode
  emulation fixes, stronger memory slot interface restrictions, mmu_lock
  spinlock hold time reduction, improved handling of large page faults
  on shadow, initial APICv HW acceleration support, s390 channel IO
  based virtio, amongst others"

* tag 'kvm-3.9-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (143 commits)
  Revert "KVM: MMU: lazily drop large spte"
  x86: pvclock kvm: align allocation size to page size
  KVM: nVMX: Remove redundant get_vmcs12 from nested_vmx_exit_handled_msr
  x86 emulator: fix parity calculation for AAD instruction
  KVM: PPC: BookE: Handle alignment interrupts
  booke: Added DBCR4 SPR number
  KVM: PPC: booke: Allow multiple exception types
  KVM: PPC: booke: use vcpu reference from thread_struct
  KVM: Remove user_alloc from struct kvm_memory_slot
  KVM: VMX: disable apicv by default
  KVM: s390: Fix handling of iscs.
  KVM: MMU: cleanup __direct_map
  KVM: MMU: remove pt_access in mmu_set_spte
  KVM: MMU: cleanup mapping-level
  KVM: MMU: lazily drop large spte
  KVM: VMX: cleanup vmx_set_cr0().
  KVM: VMX: add missing exit names to VMX_EXIT_REASONS array
  KVM: VMX: disable SMEP feature when guest is in non-paging mode
  KVM: Remove duplicate text in api.txt
  Revert "KVM: MMU: split kvm_mmu_free_page"
  ...
2013-02-24 13:07:18 -08:00
Linus Torvalds
9d3cae26ac Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
Pull powerpc updates from Benjamin Herrenschmidt:
 "So from the depth of frozen Minnesota, here's the powerpc pull request
  for 3.9.  It has a few interesting highlights, in addition to the
  usual bunch of bug fixes, minor updates, embedded device tree updates
  and new boards:

   - Hand tuned asm implementation of SHA1 (by Paulus & Michael
     Ellerman)

   - Support for Doorbell interrupts on Power8 (kind of fast
     thread-thread IPIs) by Ian Munsie

   - Long overdue cleanup of the way we handle relocation of our open
     firmware trampoline (prom_init.c) on 64-bit by Anton Blanchard

   - Support for saving/restoring & context switching the PPR (Processor
     Priority Register) on server processors that support it.  This
     allows the kernel to preserve thread priorities established by
     userspace.  By Haren Myneni.

   - DAWR (new watchpoint facility) support on Power8 by Michael Neuling

   - Ability to change the DSCR (Data Stream Control Register) which
     controls cache prefetching on a running process via ptrace by
     Alexey Kardashevskiy

   - Support for context switching the TAR register on Power8 (new
     branch target register meant to be used by some new specific
     userspace perf event interrupt facility which is yet to be enabled)
     by Ian Munsie.

   - Improve preservation of the CFAR register (which captures the
     origin of a branch) on various exception conditions by Paulus.

   - Move the Bestcomm DMA driver from arch powerpc to drivers/dma where
     it belongs by Philippe De Muyter

   - Support for Transactional Memory on Power8 by Michael Neuling
     (based on original work by Matt Evans).  For those curious about
     the feature, the patch contains a pretty good description."

(See commit db8ff90702: "powerpc: Documentation for transactional
memory on powerpc" for the mentioned description added to the file
Documentation/powerpc/transactional_memory.txt)

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (140 commits)
  powerpc/kexec: Disable hard IRQ before kexec
  powerpc/85xx: l2sram - Add compatible string for BSC9131 platform
  powerpc/85xx: bsc9131 - Correct typo in SDHC device node
  powerpc/e500/qemu-e500: enable coreint
  powerpc/mpic: allow coreint to be determined by MPIC version
  powerpc/fsl_pci: Store the pci ctlr device ptr in the pci ctlr struct
  powerpc/85xx: Board support for ppa8548
  powerpc/fsl: remove extraneous DIU platform functions
  arch/powerpc/platforms/85xx/p1022_ds.c: adjust duplicate test
  powerpc: Documentation for transactional memory on powerpc
  powerpc: Add transactional memory to pseries and ppc64 defconfigs
  powerpc: Add config option for transactional memory
  powerpc: Add transactional memory to POWER8 cpu features
  powerpc: Add new transactional memory state to the signal context
  powerpc: Hook in new transactional memory code
  powerpc: Routines for FP/VSX/VMX unavailable during a transaction
  powerpc: Add transactional memory unavaliable execption handler
  powerpc: Add reclaim and recheckpoint functions for context switching transactional memory processes
  powerpc: Add FP/VSX and VMX register load functions for transactional memory
  powerpc: Add helper functions for transactional memory context switching
  ...
2013-02-23 17:09:55 -08:00
Paul Mackerras
deb26c274d powerpc/kvm/book3s_pr: Fix compilation on 32-bit machines
Commit a413f474a0 ("powerpc: Disable relocation on exceptions whenever
PR KVM is active") added calls to pSeries_disable_reloc_on_exc() and
pSeries_enable_reloc_on_exc() to book3s_pr.c, and added declarations
of those functions to <asm/hvcall.h>, but didn't add an include of
<asm/hvcall.h> to book3s_pr.c.  64-bit kernels seem to get hvcall.h
included via some other path, but 32-bit kernels fail to compile with:

arch/powerpc/kvm/book3s_pr.c: In function ‘kvmppc_core_init_vm’:
arch/powerpc/kvm/book3s_pr.c:1300:4: error: implicit declaration of function ‘pSeries_disable_reloc_on_exc’ [-Werror=implicit-function-declaration]
arch/powerpc/kvm/book3s_pr.c: In function ‘kvmppc_core_destroy_vm’:
arch/powerpc/kvm/book3s_pr.c:1316:4: error: implicit declaration of function ‘pSeries_enable_reloc_on_exc’ [-Werror=implicit-function-declaration]
cc1: all warnings being treated as errors
make[2]: *** [arch/powerpc/kvm/book3s_pr.o] Error 1
make[1]: *** [arch/powerpc/kvm] Error 2
make: *** [sub-make] Error 2

This fixes it by adding an include of hvcall.h.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-02-15 16:54:36 +11:00
Paul Mackerras
0acb91112a powerpc/kvm/book3s_hv: Preserve guest CFAR register value
The CFAR (Come-From Address Register) is a useful debugging aid that
exists on POWER7 processors.  Currently HV KVM doesn't save or restore
the CFAR register for guest vcpus, making the CFAR of limited use in
guests.

This adds the necessary code to capture the CFAR value saved in the
early exception entry code (it has to be saved before any branch is
executed), save it in the vcpu.arch struct, and restore it on entry
to the guest.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-02-15 16:54:33 +11:00
Alexander Graf
011da89962 KVM: PPC: BookE: Handle alignment interrupts
When the guest triggers an alignment interrupt, we don't handle it properly
today and instead BUG_ON(). This really shouldn't happen.

Instead, we should just pass the interrupt back into the guest so it can deal
with it.

Reported-by: Gao Guanhua-B22826 <B22826@freescale.com>
Tested-by: Gao Guanhua-B22826 <B22826@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-02-13 12:56:45 +01:00