// SPDX-License-Identifier: GPL-2.0-only
/*
 * irq.c: API for in kernel interrupt controller
 * Copyright (c) 2007, Intel Corporation.
 * Copyright 2009 Red Hat, Inc. and/or its affiliates.
 *
 * Authors:
 *   Yaozu (Eddie) Dong <Eddie.dong@intel.com>
 */

#include <linux/export.h>
#include <linux/kvm_host.h>

#include "irq.h"
#include "i8254.h"
#include "x86.h"

/*
 * check if there are pending timer events
 * to be processed.
 */
int kvm_cpu_has_pending_timer(struct kvm_vcpu *vcpu)
{
	if (lapic_in_kernel(vcpu))
		return apic_has_pending_timer(vcpu);

	return 0;
}
EXPORT_SYMBOL(kvm_cpu_has_pending_timer);

KVM: x86: Add support for local interrupt requests from userspace
In order to enable userspace PIC support, the userspace PIC needs to
be able to inject local interrupts even when the APICs are in the
kernel.
KVM_INTERRUPT now supports sending local interrupts to an APIC when
APICs are in the kernel.
The ready_for_interrupt_request flag is now only set when the CPU/APIC
will immediately accept and inject an interrupt (i.e. APIC has not
masked the PIC).
When the PIC wishes to initiate an INTA cycle with, say, CPU0, it
kicks CPU0 out of the guest, and rendezvous with CPU0 once it arrives
in userspace.
When the CPU/APIC unmasks the PIC, a KVM_EXIT_IRQ_WINDOW_OPEN is
triggered, so that userspace has a chance to inject a PIC interrupt
if it had been pending.
Overall, this design can lead to a small number of spurious userspace
rendezvous. In particular, whenever the PIC transitions from low to
high while it is masked and whenever the PIC becomes unmasked while
it is low.
Note: this does not buffer more than one local interrupt in the
kernel, so the VMM needs to enter the guest in order to complete
interrupt injection before injecting an additional interrupt.
Compiles for x86.
Can pass the KVM Unit Tests.
Signed-off-by: Steve Rutherford <srutherford@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-07-30 17:27:16 +08:00
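The note above about not buffering more than one local interrupt can be illustrated with a small standalone model. This is only a sketch, not kernel code: `struct toy_vcpu` and the `toy_*` helpers are hypothetical names invented here, and the choice to reject a second injection with -1 is an assumption made for the example. The only things taken from this file are the -1 "empty" sentinel and the read-and-clear behaviour of the split-irqchip path.

```c
/* Hypothetical stand-in for per-vCPU state; -1 means "slot empty". */
struct toy_vcpu {
	int pending_external_vector;
};

/* Same test as pending_userspace_extint(): is a vector waiting? */
int toy_has_extint(const struct toy_vcpu *v)
{
	return v->pending_external_vector != -1;
}

/*
 * Hypothetical injection helper (an assumption of this sketch):
 * the slot holds a single vector, so a second injection is refused
 * until the first one has been consumed.
 */
int toy_inject_extint(struct toy_vcpu *v, int vector)
{
	if (toy_has_extint(v))
		return -1;
	v->pending_external_vector = vector;
	return 0;
}

/* Read the waiting vector and empty the slot in one step. */
int toy_get_extint(struct toy_vcpu *v)
{
	int vector = v->pending_external_vector;

	v->pending_external_vector = -1;
	return vector;
}
```

In this model a caller starts with the slot set to -1, injects one vector, and must let it be consumed through `toy_get_extint()` before a further injection succeeds, which is the one-at-a-time behaviour the commit message describes.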
/*
 * check if there is a pending userspace external interrupt
 */
static int pending_userspace_extint(struct kvm_vcpu *v)
{
	return v->arch.pending_external_vector != -1;
}

/*
 * check if there is pending interrupt from
 * non-APIC source without intack.
 */
static int kvm_cpu_has_extint(struct kvm_vcpu *v)
{
KVM: x86: Rename interrupt.pending to interrupt.injected
For exception & NMI events, KVM code uses the following
coding convention:
*) "pending" represents an event that should be injected to guest at
some point but its side-effects have not yet occurred.
*) "injected" represents an event whose side-effects have already
occurred.
However, interrupts don't conform to this coding convention.
All current code flows mark interrupt.pending when its side-effects
have already taken place (For example, bit moved from LAPIC IRR to
ISR). Therefore, it makes sense to just rename
interrupt.pending to interrupt.injected.
This change follows logic of previous commit 664f8e26b00c ("KVM: X86:
Fix loss of exception which has not yet been injected") which changed
exception to follow this coding convention as well.
It is important to note that in case !lapic_in_kernel(vcpu),
interrupt.pending usage was and still is incorrect.
In this case, interrupt.pending can only be set using one of the
following ioctls: KVM_INTERRUPT, KVM_SET_VCPU_EVENTS and
KVM_SET_SREGS. Looking at how QEMU uses these ioctls, one can see that
QEMU uses them either to re-set an "interrupt.pending" state it has
received from KVM (via KVM_GET_VCPU_EVENTS interrupt.pending or
via KVM_GET_SREGS interrupt_bitmap) or by dispatching a new interrupt
from QEMU's emulated LAPIC which resets the bit in IRR and sets the bit in ISR
before sending ioctl to KVM. So it seems that indeed "interrupt.pending"
in this case is also supposed to represent "interrupt.injected".
However, kvm_cpu_has_interrupt() & kvm_cpu_has_injectable_intr()
are misusing (now named) interrupt.injected in order to return if
there is a pending interrupt.
This leads to nVMX/nSVM not being able to distinguish if it should exit
from L2 to L1 on EXTERNAL_INTERRUPT on pending interrupt or should
re-inject an injected interrupt.
Therefore, add a FIXME at these functions for handling this issue.
This patch introduces no semantic change.
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2018-03-23 08:01:31 +08:00
	/*
	 * FIXME: interrupt.injected represents an interrupt whose
	 * side-effects have already been applied (e.g. bit from IRR
	 * already moved to ISR). Therefore, it is incorrect to rely
	 * on interrupt.injected to know if there is a pending
	 * interrupt in the user-mode LAPIC.
	 * This leads to nVMX/nSVM not being able to distinguish
	 * if it should exit from L2 to L1 on EXTERNAL_INTERRUPT on
	 * pending interrupt or should re-inject an injected
	 * interrupt.
	 */
	if (!lapic_in_kernel(v))
		return v->arch.interrupt.injected;

	if (!kvm_apic_accept_pic_intr(v))
		return 0;

	if (irqchip_split(v->kvm))
		return pending_userspace_extint(v);
	else
		return v->kvm->arch.vpic->output;
}

/*
 * check if there is injectable interrupt:
 * when virtual interrupt delivery enabled,
 * interrupt from apic will be handled by hardware,
 * we don't need to check it here.
 */
int kvm_cpu_has_injectable_intr(struct kvm_vcpu *v)
{
	if (kvm_cpu_has_extint(v))
		return 1;

	if (!is_guest_mode(v) && kvm_vcpu_apicv_active(v))
		return 0;

	return kvm_apic_has_interrupt(v) != -1; /* LAPIC */
}
EXPORT_SYMBOL_GPL(kvm_cpu_has_injectable_intr);

/*
 * check if there is pending interrupt without
 * intack.
 */
int kvm_cpu_has_interrupt(struct kvm_vcpu *v)
{
	if (kvm_cpu_has_extint(v))
		return 1;

	return kvm_apic_has_interrupt(v) != -1; /* LAPIC */
}
EXPORT_SYMBOL_GPL(kvm_cpu_has_interrupt);

/*
 * Read pending interrupt(from non-APIC source)
 * vector and intack.
 */
static int kvm_cpu_get_extint(struct kvm_vcpu *v)
{
	if (!kvm_cpu_has_extint(v)) {
		WARN_ON(!lapic_in_kernel(v));
		return -1;
	}

	if (!lapic_in_kernel(v))
		return v->arch.interrupt.nr;

	if (irqchip_split(v->kvm)) {
		int vector = v->arch.pending_external_vector;

		v->arch.pending_external_vector = -1;
		return vector;
	} else
		return kvm_pic_read_irq(v->kvm); /* PIC */
}

/*
 * Read pending interrupt vector and intack.
 */
int kvm_cpu_get_interrupt(struct kvm_vcpu *v)
{
	int vector = kvm_cpu_get_extint(v);
KVM: nVMX: fix "acknowledge interrupt on exit" when APICv is in use
After commit 77b0f5d (KVM: nVMX: Ack and write vector info to intr_info
if L1 asks us to), "Acknowledge interrupt on exit" behavior can be
emulated. To do so, KVM will ask the APIC for the interrupt vector
during a nested vmexit if VM_EXIT_ACK_INTR_ON_EXIT is set. With APICv,
kvm_get_apic_interrupt would return -1 and give the following WARNING:
Call Trace:
[<ffffffff81493563>] dump_stack+0x49/0x5e
[<ffffffff8103f0eb>] warn_slowpath_common+0x7c/0x96
[<ffffffffa059709a>] ? nested_vmx_vmexit+0xa4/0x233 [kvm_intel]
[<ffffffff8103f11a>] warn_slowpath_null+0x15/0x17
[<ffffffffa059709a>] nested_vmx_vmexit+0xa4/0x233 [kvm_intel]
[<ffffffffa0594295>] ? nested_vmx_exit_handled+0x6a/0x39e [kvm_intel]
[<ffffffffa0537931>] ? kvm_apic_has_interrupt+0x80/0xd5 [kvm]
[<ffffffffa05972ec>] vmx_check_nested_events+0xc3/0xd3 [kvm_intel]
[<ffffffffa051ebe9>] inject_pending_event+0xd0/0x16e [kvm]
[<ffffffffa051efa0>] vcpu_enter_guest+0x319/0x704 [kvm]
To fix this, we cannot rely on the processor's virtual interrupt delivery,
because "acknowledge interrupt on exit" must only update the virtual
ISR/PPR/IRR registers (and SVI, which is just a cache of the virtual ISR)
but it should not deliver the interrupt through the IDT. Thus, KVM has
to deliver the interrupt "by hand", similar to the treatment of EOI in
commit fc57ac2c9ca8 (KVM: lapic: sync highest ISR to hardware apic on
EOI, 2014-05-14).
The patch modifies kvm_cpu_get_interrupt to always acknowledge an
interrupt; there are only two callers, and the other is not affected
because it is never reached with kvm_apic_vid_enabled() == true. Then it
modifies apic_set_isr and apic_clear_irr to update SVI and RVI in addition
to the registers.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Suggested-by: "Zhang, Yang Z" <yang.z.zhang@intel.com>
Tested-by: Liu, RongrongX <rongrongx.liu@intel.com>
Tested-by: Felipe Reyes <freyes@suse.com>
Fixes: 77b0f5d67ff2781f36831cba79674c3e97bd7acf
Cc: stable@vger.kernel.org
Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-05 12:42:24 +08:00
	if (vector != -1)
		return vector; /* PIC */

	return kvm_get_apic_interrupt(v); /* APIC */
}
EXPORT_SYMBOL_GPL(kvm_cpu_get_interrupt);

void kvm_inject_pending_timer_irqs(struct kvm_vcpu *vcpu)
{
	if (lapic_in_kernel(vcpu))
		kvm_inject_apic_timer_irqs(vcpu);
}
EXPORT_SYMBOL_GPL(kvm_inject_pending_timer_irqs);

void __kvm_migrate_timers(struct kvm_vcpu *vcpu)
{
	__kvm_migrate_apic_timer(vcpu);
	__kvm_migrate_pit_timer(vcpu);
	if (kvm_x86_ops.migrate_timers)
		kvm_x86_ops.migrate_timers(vcpu);
}

bool kvm_arch_irqfd_allowed(struct kvm *kvm, struct kvm_irqfd *args)
{
	bool resample = args->flags & KVM_IRQFD_FLAG_RESAMPLE;

	return resample ? irqchip_kernel(kvm) : irqchip_in_kernel(kvm);
}