KVM: SVM: Split svm_vcpu_run inline assembly to separate file
The compiler (GCC) does not like the situation, where there is inline
assembly block that clobbers all available machine registers in the
middle of the function. This situation can be found in function
svm_vcpu_run in file kvm/svm.c and results in many register spills and
fills to/from stack frame.
This patch fixes the issue with the same approach as was done for
VMX some time ago. The big inline assembly is moved to a separate
assembly .S file, taking into account all ABI requirements.
There are two main benefits of the above approach:
* elimination of several register spills and fills to/from stack
frame, and consequently smaller function .text size. The binary size
of svm_vcpu_run is lowered from 2019 to 1626 bytes.
* more efficient access to a register save array. Currently, register
save array is accessed as:
7b00: 48 8b 98 28 02 00 00 mov 0x228(%rax),%rbx
7b07: 48 8b 88 18 02 00 00 mov 0x218(%rax),%rcx
7b0e: 48 8b 90 20 02 00 00 mov 0x220(%rax),%rdx
and passing a pointer to a register array as an argument to a function one gets:
12: 48 8b 48 08 mov 0x8(%rax),%rcx
16: 48 8b 50 10 mov 0x10(%rax),%rdx
1a: 48 8b 58 18 mov 0x18(%rax),%rbx
As a result, the total size, considering that the new function size is 229
bytes, gets lowered by 164 bytes.
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-03-30 21:02:13 +08:00
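
The size figures and the two disassembly listings in the message above can be cross-checked with a short C sketch. It assumes the usual x86-64 ModRM encoding, where a non-zero displacement that fits in a signed 8-bit value takes one byte and a larger one takes four. The helper names are hypothetical and are not part of the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes needed to encode a non-zero ModRM displacement (hypothetical helper). */
static int disp_bytes(int32_t disp)
{
	return (disp >= INT8_MIN && disp <= INT8_MAX) ? 1 : 4;
}

/*
 * Length of "mov disp(%rax), %r64" for a non-zero displacement:
 * REX.W prefix + opcode + ModRM, followed by the displacement.
 */
static int mov_load_len(int32_t disp)
{
	assert(disp != 0);
	return 3 + disp_bytes(disp);
}

/* Net .text change: shrinkage of the caller minus the size of the new function. */
static int net_saving(int old_size, int new_size, int added_func_size)
{
	return (old_size - new_size) - added_func_size;
}
```

With the numbers quoted in the message, `net_saving(2019, 1626, 229)` evaluates to 164, and `mov_load_len()` gives 7 bytes for offset 0x228 against 4 bytes for offset 0x8, which matches the byte counts in the two listings.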

/* SPDX-License-Identifier: GPL-2.0 */
#include <linux/linkage.h>
#include <asm/asm.h>
#include <asm/asm-offsets.h>
#include <asm/bitsperlong.h>
#include <asm/kvm_vcpu_regs.h>
#include <asm/nospec-branch.h>
#include "kvm-asm-offsets.h"

#define WORD_SIZE (BITS_PER_LONG / 8)

/* Intentionally omit RAX as it's context switched by hardware */
#define VCPU_RCX	(SVM_vcpu_arch_regs + __VCPU_REGS_RCX * WORD_SIZE)
#define VCPU_RDX	(SVM_vcpu_arch_regs + __VCPU_REGS_RDX * WORD_SIZE)
#define VCPU_RBX	(SVM_vcpu_arch_regs + __VCPU_REGS_RBX * WORD_SIZE)
/* Intentionally omit RSP as it's context switched by hardware */
#define VCPU_RBP	(SVM_vcpu_arch_regs + __VCPU_REGS_RBP * WORD_SIZE)
#define VCPU_RSI	(SVM_vcpu_arch_regs + __VCPU_REGS_RSI * WORD_SIZE)
#define VCPU_RDI	(SVM_vcpu_arch_regs + __VCPU_REGS_RDI * WORD_SIZE)

#ifdef CONFIG_X86_64
#define VCPU_R8		(SVM_vcpu_arch_regs + __VCPU_REGS_R8 * WORD_SIZE)
#define VCPU_R9		(SVM_vcpu_arch_regs + __VCPU_REGS_R9 * WORD_SIZE)
#define VCPU_R10	(SVM_vcpu_arch_regs + __VCPU_REGS_R10 * WORD_SIZE)
#define VCPU_R11	(SVM_vcpu_arch_regs + __VCPU_REGS_R11 * WORD_SIZE)
#define VCPU_R12	(SVM_vcpu_arch_regs + __VCPU_REGS_R12 * WORD_SIZE)
#define VCPU_R13	(SVM_vcpu_arch_regs + __VCPU_REGS_R13 * WORD_SIZE)
#define VCPU_R14	(SVM_vcpu_arch_regs + __VCPU_REGS_R14 * WORD_SIZE)
#define VCPU_R15	(SVM_vcpu_arch_regs + __VCPU_REGS_R15 * WORD_SIZE)
#endif
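
The VCPU_* definitions above all have the same shape: the byte offset of the register array plus the register index times the word size. The C sketch below illustrates that computation on a made-up structure. `struct demo_vcpu`, `enum demo_reg` and the `DEMO_*` macros are hypothetical stand-ins; the real base offset and indices come from the generated offsets header and from `<asm/kvm_vcpu_regs.h>`.

```c
#include <stddef.h>

/* Hypothetical register indices; the real ones come from <asm/kvm_vcpu_regs.h>. */
enum demo_reg { DEMO_RAX, DEMO_RCX, DEMO_RDX, DEMO_RBX, DEMO_NR_REGS };

/* Hypothetical stand-in for the structure that holds the register array. */
struct demo_vcpu {
	int other_state;
	unsigned long regs[DEMO_NR_REGS];
};

#define DEMO_WORD_SIZE	(sizeof(unsigned long))
#define DEMO_REGS_BASE	(offsetof(struct demo_vcpu, regs))

/* Same shape as the VCPU_* macros: base offset + index * word size. */
#define DEMO_REG_OFFSET(idx)	(DEMO_REGS_BASE + (idx) * DEMO_WORD_SIZE)

/* Read a register slot through its byte offset, as the assembly does. */
static unsigned long demo_read_reg(const struct demo_vcpu *v, enum demo_reg idx)
{
	const char *base = (const char *)v;

	return *(const unsigned long *)(base + DEMO_REG_OFFSET(idx));
}
```

In the assembly the same offset is used as a displacement from the register that holds the structure pointer, so no index arithmetic is needed at run time.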

#define SVM_vmcb01_pa	(SVM_vmcb01 + KVM_VMCB_pa)

.section .noinstr.text, "ax"
KVM: SVM: move MSR_IA32_SPEC_CTRL save/restore to assembly
Restoration of the host IA32_SPEC_CTRL value is probably too late
with respect to the return thunk training sequence.
With respect to the user/kernel boundary, AMD says, "If software chooses
to toggle STIBP (e.g., set STIBP on kernel entry, and clear it on kernel
exit), software should set STIBP to 1 before executing the return thunk
training sequence." I assume the same requirements apply to the guest/host
boundary. The return thunk training sequence is in vmenter.S, quite close
to the VM-exit. On hosts without V_SPEC_CTRL, however, the host's
IA32_SPEC_CTRL value is not restored until much later.
To avoid this, move the restoration of host SPEC_CTRL to assembly and,
for consistency, move the restoration of the guest SPEC_CTRL as well.
This is not particularly difficult, apart from some care to cover both
32- and 64-bit, and to share code between SEV-ES and normal vmentry.
Cc: stable@vger.kernel.org
Fixes: a149180fbcf3 ("x86: Add magic AMD return-thunk")
Suggested-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-01 02:24:40 +08:00

.macro RESTORE_GUEST_SPEC_CTRL
	/* No need to do anything if SPEC_CTRL is unset or V_SPEC_CTRL is set */
	ALTERNATIVE_2 "", \
		"jmp 800f", X86_FEATURE_MSR_SPEC_CTRL, \
		"", X86_FEATURE_V_SPEC_CTRL
801:
.endm
.macro RESTORE_GUEST_SPEC_CTRL_BODY
800:
	/*
	 * SPEC_CTRL handling: if the guest's SPEC_CTRL value differs from the
	 * host's, write the MSR. This is kept out-of-line so that the common
	 * case does not have to jump.
	 *
	 * IMPORTANT: To avoid RSB underflow attacks and any other nastiness,
	 * there must not be any returns or indirect branches between this code
	 * and vmentry.
	 */
	movl SVM_spec_ctrl(%_ASM_DI), %eax
	cmp PER_CPU_VAR(x86_spec_ctrl_current), %eax
	je 801b
	mov $MSR_IA32_SPEC_CTRL, %ecx
	xor %edx, %edx
	wrmsr
	jmp 801b
.endm

.macro RESTORE_HOST_SPEC_CTRL
	/* No need to do anything if SPEC_CTRL is unset or V_SPEC_CTRL is set */
	ALTERNATIVE_2 "", \
		"jmp 900f", X86_FEATURE_MSR_SPEC_CTRL, \
		"", X86_FEATURE_V_SPEC_CTRL
901:
.endm
.macro RESTORE_HOST_SPEC_CTRL_BODY
900:
	/* Same for after vmexit. */
	mov $MSR_IA32_SPEC_CTRL, %ecx

	/*
	 * Load the value that the guest had written into MSR_IA32_SPEC_CTRL,
	 * if it was not intercepted during guest execution.
	 */
	cmpb $0, (%_ASM_SP)
	jnz 998f
	rdmsr
	movl %eax, SVM_spec_ctrl(%_ASM_DI)
998:

	/* Now restore the host value of the MSR if different from the guest's. */
	movl PER_CPU_VAR(x86_spec_ctrl_current), %eax
	cmp SVM_spec_ctrl(%_ASM_DI), %eax
	je 901b
	xor %edx, %edx
	wrmsr
	jmp 901b
.endm
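
The decision logic of the two *_BODY macros above can be modelled in C. This is only a sketch of the case where the MSR is present and V_SPEC_CTRL is not, since that is the only case in which the bodies run. The MSR is simulated with a plain field, and `struct demo_cpu` and the `demo_*` functions are hypothetical names, not kernel interfaces.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model state; the real code uses rdmsr/wrmsr and per-CPU data. */
struct demo_cpu {
	uint32_t msr_spec_ctrl;		/* simulated MSR_IA32_SPEC_CTRL (low 32 bits) */
	uint32_t host_spec_ctrl;	/* models x86_spec_ctrl_current */
	unsigned int msr_writes;	/* counts simulated wrmsr executions */
};

static void demo_wrmsr(struct demo_cpu *cpu, uint32_t val)
{
	cpu->msr_spec_ctrl = val;
	cpu->msr_writes++;
}

/* Before vmentry: write the guest value only if it differs from the host's. */
static void demo_restore_guest(struct demo_cpu *cpu, uint32_t guest_spec_ctrl)
{
	if (guest_spec_ctrl != cpu->host_spec_ctrl)
		demo_wrmsr(cpu, guest_spec_ctrl);
}

/*
 * After vmexit: if writes were not intercepted, the guest may have changed
 * the MSR, so read it back first; then restore the host value if different.
 */
static void demo_restore_host(struct demo_cpu *cpu, uint32_t *guest_spec_ctrl,
			      bool spec_ctrl_intercepted)
{
	if (!spec_ctrl_intercepted)
		*guest_spec_ctrl = cpu->msr_spec_ctrl;
	if (cpu->host_spec_ctrl != *guest_spec_ctrl)
		demo_wrmsr(cpu, cpu->host_spec_ctrl);
}
```

In both directions the write is skipped when the guest and host values already agree, which is why the assembly keeps the bodies out of line and lets the common path fall through.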

/**
 * __svm_vcpu_run - Run a vCPU via a transition to SVM guest mode
 * @svm:			struct vcpu_svm *
 * @spec_ctrl_intercepted:	bool
 */
SYM_FUNC_START(__svm_vcpu_run)
	push %_ASM_BP
#ifdef CONFIG_X86_64
	push %r15
	push %r14
	push %r13
	push %r12
#else
	push %edi
	push %esi
#endif
	push %_ASM_BX

	/*
	 * Save variables needed after vmexit on the stack, in inverse
	 * order compared to when they are needed.
	 */

	/* Accessed directly from the stack in RESTORE_HOST_SPEC_CTRL. */
	push %_ASM_ARG2

	/* Needed to restore access to percpu variables. */
	__ASM_SIZE(push) PER_CPU_VAR(svm_data + SD_save_area_pa)

	/* Finally save @svm. */
	push %_ASM_ARG1
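
The push order above follows from the stack being last-in, first-out: the value needed first after vmexit is pushed last. A small C sketch of that ordering is given here, under the assumption (taken from the comment) that the values are consumed in reverse of the push order. `struct demo_stack` and the `demo_*` helpers are hypothetical and only model a downward-growing stack of machine words.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_STACK_SLOTS 8

/* Hypothetical downward-growing stack of machine words. */
struct demo_stack {
	uintptr_t slot[DEMO_STACK_SLOTS];
	int sp;		/* index of the top of stack; DEMO_STACK_SLOTS when empty */
};

static void demo_init(struct demo_stack *s)
{
	s->sp = DEMO_STACK_SLOTS;
}

static void demo_push(struct demo_stack *s, uintptr_t v)
{
	assert(s->sp > 0);
	s->slot[--s->sp] = v;
}

static uintptr_t demo_pop(struct demo_stack *s)
{
	assert(s->sp < DEMO_STACK_SLOTS);
	return s->slot[s->sp++];
}

/* Peek at the word on top of the stack, like an access through (%_ASM_SP). */
static uintptr_t demo_top(const struct demo_stack *s)
{
	assert(s->sp < DEMO_STACK_SLOTS);
	return s->slot[s->sp];
}
```

Pushing the intercept flag, then the save area address, then the @svm pointer means two pops return @svm and the save area address in that order, after which the flag sits on top of the stack where a `(%_ASM_SP)` access can read it.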

.ifnc _ASM_ARG1, _ASM_DI
	/*
	 * Stash @svm in RDI early. On 32-bit, arguments are in RAX, RCX
	 * and RDX which are clobbered by RESTORE_GUEST_SPEC_CTRL.
	 */
	mov %_ASM_ARG1, %_ASM_DI
.endif

	/* Clobbers RAX, RCX, RDX. */
	RESTORE_GUEST_SPEC_CTRL

	/*
	 * Use a single vmcb (vmcb01 because it's always valid) for
	 * context switching guest state via VMLOAD/VMSAVE, that way
	 * the state doesn't need to be copied between vmcb01 and
	 * vmcb02 when switching vmcbs for nested virtualization.
	 */
	mov SVM_vmcb01_pa(%_ASM_DI), %_ASM_AX
1:	vmload %_ASM_AX
2:

	/* Get svm->current_vmcb->pa into RAX. */
	mov SVM_current_vmcb(%_ASM_DI), %_ASM_AX
	mov KVM_VMCB_pa(%_ASM_AX), %_ASM_AX
KVM: SVM: Split svm_vcpu_run inline assembly to separate file
The compiler (GCC) does not like the situation, where there is inline
assembly block that clobbers all available machine registers in the
middle of the function. This situation can be found in function
svm_vcpu_run in file kvm/svm.c and results in many register spills and
fills to/from stack frame.
This patch fixes the issue with the same approach as was done for
VMX some time ago. The big inline assembly is moved to a separate
assembly .S file, taking into account all ABI requirements.
There are two main benefits of the above approach:
* elimination of several register spills and fills to/from stack
frame, and consequently smaller function .text size. The binary size
of svm_vcpu_run is lowered from 2019 to 1626 bytes.
* more efficient access to a register save array. Currently, register
save array is accessed as:
7b00: 48 8b 98 28 02 00 00 mov 0x228(%rax),%rbx
7b07: 48 8b 88 18 02 00 00 mov 0x218(%rax),%rcx
7b0e: 48 8b 90 20 02 00 00 mov 0x220(%rax),%rdx
and passing ia pointer to a register array as an argument to a function one gets:
12: 48 8b 48 08 mov 0x8(%rax),%rcx
16: 48 8b 50 10 mov 0x10(%rax),%rdx
1a: 48 8b 58 18 mov 0x18(%rax),%rbx
As a result, the total size, considering that the new function size is 229
bytes, gets lowered by 164 bytes.
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-03-30 21:02:13 +08:00
|
|
|
|
|
|
|
/* Load guest registers. */
|
2022-10-29 05:30:07 +08:00
|
|
|
mov VCPU_RCX(%_ASM_DI), %_ASM_CX
|
|
|
|
mov VCPU_RDX(%_ASM_DI), %_ASM_DX
|
|
|
|
mov VCPU_RBX(%_ASM_DI), %_ASM_BX
|
|
|
|
mov VCPU_RBP(%_ASM_DI), %_ASM_BP
|
|
|
|
mov VCPU_RSI(%_ASM_DI), %_ASM_SI
|
2020-03-30 21:02:13 +08:00
#ifdef CONFIG_X86_64
2022-10-29 05:30:07 +08:00
	mov VCPU_R8 (%_ASM_DI), %r8
	mov VCPU_R9 (%_ASM_DI), %r9
	mov VCPU_R10(%_ASM_DI), %r10
	mov VCPU_R11(%_ASM_DI), %r11
	mov VCPU_R12(%_ASM_DI), %r12
	mov VCPU_R13(%_ASM_DI), %r13
	mov VCPU_R14(%_ASM_DI), %r14
	mov VCPU_R15(%_ASM_DI), %r15
2020-03-30 21:02:13 +08:00
#endif
2022-10-29 05:30:07 +08:00
	mov VCPU_RDI(%_ASM_DI), %_ASM_DI
2020-03-30 21:02:13 +08:00
	/* Enter guest mode */
2020-04-13 15:17:58 +08:00
	sti
2020-03-30 21:02:13 +08:00
2022-11-07 18:14:27 +08:00
3:	vmrun %_ASM_AX
4:
	cli
2020-04-13 15:17:58 +08:00

2022-11-07 18:14:27 +08:00
	/* Pop @svm to RAX while it's the only available register. */
2020-03-30 21:02:13 +08:00
	pop %_ASM_AX

	/* Save all guest registers. */
	mov %_ASM_CX, VCPU_RCX(%_ASM_AX)
	mov %_ASM_DX, VCPU_RDX(%_ASM_AX)
	mov %_ASM_BX, VCPU_RBX(%_ASM_AX)
	mov %_ASM_BP, VCPU_RBP(%_ASM_AX)
	mov %_ASM_SI, VCPU_RSI(%_ASM_AX)
	mov %_ASM_DI, VCPU_RDI(%_ASM_AX)
#ifdef CONFIG_X86_64
	mov %r8, VCPU_R8 (%_ASM_AX)
	mov %r9, VCPU_R9 (%_ASM_AX)
	mov %r10, VCPU_R10(%_ASM_AX)
	mov %r11, VCPU_R11(%_ASM_AX)
	mov %r12, VCPU_R12(%_ASM_AX)
	mov %r13, VCPU_R13(%_ASM_AX)
	mov %r14, VCPU_R14(%_ASM_AX)
	mov %r15, VCPU_R15(%_ASM_AX)
#endif
2022-11-07 18:14:27 +08:00
	/* @svm can stay in RDI from now on. */
	mov %_ASM_AX, %_ASM_DI

	mov SVM_vmcb01_pa(%_ASM_DI), %_ASM_AX
5:	vmsave %_ASM_AX
6:

2022-11-07 16:49:59 +08:00
	/* Restores GSBASE among other things, allowing access to percpu data. */
	pop %_ASM_AX
7:	vmload %_ASM_AX
8:

2022-11-07 18:14:27 +08:00
#ifdef CONFIG_RETPOLINE
	/* IMPORTANT: Stuff the RSB immediately after VM-Exit, before RET! */
	FILL_RETURN_BUFFER %_ASM_AX, RSB_CLEAR_LOOPS, X86_FEATURE_RETPOLINE
#endif
KVM: SVM: move MSR_IA32_SPEC_CTRL save/restore to assembly
Restoration of the host IA32_SPEC_CTRL value is probably too late
with respect to the return thunk training sequence.
With respect to the user/kernel boundary, AMD says, "If software chooses
to toggle STIBP (e.g., set STIBP on kernel entry, and clear it on kernel
exit), software should set STIBP to 1 before executing the return thunk
training sequence." I assume the same requirements apply to the guest/host
boundary. The return thunk training sequence is in vmenter.S, quite close
to the VM-exit. On hosts without V_SPEC_CTRL, however, the host's
IA32_SPEC_CTRL value is not restored until much later.
To avoid this, move the restoration of host SPEC_CTRL to assembly and,
for consistency, move the restoration of the guest SPEC_CTRL as well.
This is not particularly difficult, apart from some care to cover both
32- and 64-bit, and to share code between SEV-ES and normal vmentry.
Cc: stable@vger.kernel.org
Fixes: a149180fbcf3 ("x86: Add magic AMD return-thunk")
Suggested-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-01 02:24:40 +08:00
	/* Clobbers RAX, RCX, RDX. */
	RESTORE_HOST_SPEC_CTRL

2022-06-15 05:15:48 +08:00
	/*
	 * Mitigate RETBleed for AMD/Hygon Zen uarch. RET should be
	 * untrained as soon as we exit the VM and are back to the
	 * kernel. This should be done before re-enabling interrupts
	 * because interrupt handlers won't sanitize 'ret' if the return is
	 * from the kernel.
	 */
2023-08-14 19:44:35 +08:00
	UNTRAIN_RET_VM
2023-07-07 19:53:41 +08:00
2020-03-30 21:02:13 +08:00
	/*
	 * Clear all general purpose registers except RSP and RAX to prevent
	 * speculative use of the guest's values, even those that are reloaded
	 * via the stack. In theory, an L1 cache miss when restoring registers
	 * could lead to speculative execution with the guest's values.
	 * Zeroing XORs are dirt cheap, i.e. the extra paranoia is essentially
	 * free. RSP and RAX are exempt as they are restored by hardware
	 * during VM-Exit.
	 */
	xor %ecx, %ecx
	xor %edx, %edx
	xor %ebx, %ebx
	xor %ebp, %ebp
	xor %esi, %esi
	xor %edi, %edi
#ifdef CONFIG_X86_64
	xor %r8d, %r8d
	xor %r9d, %r9d
	xor %r10d, %r10d
	xor %r11d, %r11d
	xor %r12d, %r12d
	xor %r13d, %r13d
	xor %r14d, %r14d
	xor %r15d, %r15d
#endif
2022-10-01 02:24:40 +08:00
	/* "Pop" @spec_ctrl_intercepted. */
	pop %_ASM_BX
2020-03-30 21:02:13 +08:00
	pop %_ASM_BX

#ifdef CONFIG_X86_64
	pop %r12
	pop %r13
	pop %r14
	pop %r15
#else
	pop %esi
	pop %edi
#endif
	pop %_ASM_BP
2021-12-04 21:43:40 +08:00
	RET
2021-02-26 20:56:21 +08:00
2022-10-01 02:24:40 +08:00
	RESTORE_GUEST_SPEC_CTRL_BODY
	RESTORE_HOST_SPEC_CTRL_BODY

2022-11-07 18:14:27 +08:00
10:	cmpb $0, kvm_rebooting
2021-02-26 20:56:21 +08:00
	jne 2b
	ud2
2022-11-07 18:14:27 +08:00
30:	cmpb $0, kvm_rebooting
	jne 4b
	ud2
50:	cmpb $0, kvm_rebooting
	jne 6b
	ud2
2022-11-07 16:49:59 +08:00
70:	cmpb $0, kvm_rebooting
	jne 8b
	ud2
2021-02-26 20:56:21 +08:00

2022-11-07 18:14:27 +08:00
	_ASM_EXTABLE(1b, 10b)
	_ASM_EXTABLE(3b, 30b)
	_ASM_EXTABLE(5b, 50b)
2022-11-07 16:49:59 +08:00
	_ASM_EXTABLE(7b, 70b)
2021-02-26 20:56:21 +08:00
2020-03-30 21:02:13 +08:00
SYM_FUNC_END(__svm_vcpu_run)
2020-12-11 01:10:08 +08:00

/**
 * __svm_sev_es_vcpu_run - Run a SEV-ES vCPU via a transition to SVM guest mode
2022-11-07 17:17:29 +08:00
 * @svm: struct vcpu_svm *
2022-10-01 02:24:40 +08:00
 * @spec_ctrl_intercepted: bool
2020-12-11 01:10:08 +08:00
 */
SYM_FUNC_START(__svm_sev_es_vcpu_run)
	push %_ASM_BP
#ifdef CONFIG_X86_64
	push %r15
	push %r14
	push %r13
	push %r12
#else
	push %edi
	push %esi
#endif
	push %_ASM_BX
2022-10-01 02:24:40 +08:00
	/*
	 * Save variables needed after vmexit on the stack, in inverse
	 * order compared to when they are needed.
	 */

	/* Accessed directly from the stack in RESTORE_HOST_SPEC_CTRL. */
	push %_ASM_ARG2

	/* Save @svm. */
	push %_ASM_ARG1

.ifnc _ASM_ARG1, _ASM_DI
	/*
	 * Stash @svm in RDI early. On 32-bit, arguments are in RAX, RCX
	 * and RDX which are clobbered by RESTORE_GUEST_SPEC_CTRL.
	 */
	mov %_ASM_ARG1, %_ASM_DI
.endif

	/* Clobbers RAX, RCX, RDX. */
	RESTORE_GUEST_SPEC_CTRL

2022-11-07 17:17:29 +08:00
	/* Get svm->current_vmcb->pa into RAX. */
2022-10-01 02:24:40 +08:00
	mov SVM_current_vmcb(%_ASM_DI), %_ASM_AX
2022-11-07 17:17:29 +08:00
	mov KVM_VMCB_pa(%_ASM_AX), %_ASM_AX
2021-02-26 20:56:21 +08:00

	/* Enter guest mode */
2020-12-11 01:10:08 +08:00
	sti

1:	vmrun %_ASM_AX

2021-02-26 20:56:21 +08:00
2:	cli
2020-12-11 01:10:08 +08:00
2022-10-01 02:24:40 +08:00
	/* Pop @svm to RDI, guest registers have been saved already. */
	pop %_ASM_DI

2020-12-11 01:10:08 +08:00
#ifdef CONFIG_RETPOLINE
	/* IMPORTANT: Stuff the RSB immediately after VM-Exit, before RET! */
	FILL_RETURN_BUFFER %_ASM_AX, RSB_CLEAR_LOOPS, X86_FEATURE_RETPOLINE
#endif
2022-10-01 02:24:40 +08:00
	/* Clobbers RAX, RCX, RDX. */
	RESTORE_HOST_SPEC_CTRL

2022-06-15 05:15:48 +08:00
	/*
	 * Mitigate RETBleed for AMD/Hygon Zen uarch. RET should be
	 * untrained as soon as we exit the VM and are back to the
	 * kernel. This should be done before re-enabling interrupts
	 * because interrupt handlers won't sanitize RET if the return is
	 * from the kernel.
	 */
2023-08-14 19:44:35 +08:00
	UNTRAIN_RET_VM
2022-06-15 05:15:48 +08:00
2022-10-01 02:24:40 +08:00
	/* "Pop" @spec_ctrl_intercepted. */
	pop %_ASM_BX

2020-12-11 01:10:08 +08:00
	pop %_ASM_BX

#ifdef CONFIG_X86_64
	pop %r12
	pop %r13
	pop %r14
	pop %r15
#else
	pop %esi
	pop %edi
#endif
	pop %_ASM_BP
2021-12-04 21:43:40 +08:00
	RET
2021-02-26 20:56:21 +08:00
2022-10-01 02:24:40 +08:00
	RESTORE_GUEST_SPEC_CTRL_BODY
	RESTORE_HOST_SPEC_CTRL_BODY

2021-02-26 20:56:21 +08:00
3:	cmpb $0, kvm_rebooting
	jne 2b
	ud2

	_ASM_EXTABLE(1b, 3b)

2020-12-11 01:10:08 +08:00
SYM_FUNC_END(__svm_sev_es_vcpu_run)