linux/arch/x86/kernel
Mike Christie f9010dbdce fork, vhost: Use CLONE_THREAD to fix freezer/ps regression
When switching from kthreads to vhost_tasks two bugs were added:
1. The vhost worker tasks's now show up as processes so scripts doing
ps or ps a would not incorrectly detect the vhost task as another
process.  2. kthreads disabled freeze by setting PF_NOFREEZE, but
vhost tasks's didn't disable or add support for them.

To fix both bugs, this switches the vhost task to be thread in the
process that does the VHOST_SET_OWNER ioctl, and has vhost_worker call
get_signal to support SIGKILL/SIGSTOP and freeze signals. Note that
SIGKILL/STOP support is required because CLONE_THREAD requires
CLONE_SIGHAND which requires those 2 signals to be supported.

This is a modified version of the patch written by Mike Christie
<michael.christie@oracle.com> which was a modified version of patch
originally written by Linus.

Much of what depended upon PF_IO_WORKER now depends on PF_USER_WORKER.
Including ignoring signals, setting up the register state, and having
get_signal return instead of calling do_group_exit.

Tidied up the vhost_task abstraction so that the definition of
vhost_task only needs to be visible inside of vhost_task.c.  Making
it easier to review the code and tell what needs to be done where.
As part of this the main loop has been moved from vhost_worker into
vhost_task_fn.  vhost_worker now returns true if work was done.

The main loop has been updated to call get_signal which handles
SIGSTOP, freezing, and collects the message that tells the thread to
exit as part of process exit.  This collection clears
__fatal_signal_pending.  This collection is not guaranteed to
clear signal_pending() so clear that explicitly so the schedule()
sleeps.

For now the vhost thread continues to exist and run work until the
last file descriptor is closed and the release function is called as
part of freeing struct file.  To avoid hangs in the coredump
rendezvous and when killing threads in a multi-threaded exec.  The
coredump code and de_thread have been modified to ignore vhost threads.

Remvoing the special case for exec appears to require teaching
vhost_dev_flush how to directly complete transactions in case
the vhost thread is no longer running.

Removing the special case for coredump rendezvous requires either the
above fix needed for exec or moving the coredump rendezvous into
get_signal.

Fixes: 6e890c5d50 ("vhost: use vhost_tasks for worker threads")
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Co-developed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-06-01 17:15:33 -04:00
..
acpi x86 APIC updates: 2023-04-25 11:39:45 -07:00
apic x86 APIC updates: 2023-04-25 11:39:45 -07:00
cpu x86/topology: Fix erroneous smp_num_siblings on Intel Hybrid platforms 2023-05-25 10:48:42 -07:00
fpu fork, vhost: Use CLONE_THREAD to fix freezer/ps regression 2023-06-01 17:15:33 -04:00
kprobes probes updates for 6.3: 2023-02-23 13:03:08 -08:00
.gitignore
alternative.c x86/alternatives: Teach text_poke_bp() to patch Jcc.d32 instructions 2023-01-31 15:05:31 +01:00
amd_gart_64.c x86/mm: Remove P*D_PAGE_MASK and P*D_PAGE_SIZE macros 2022-12-15 10:37:27 -08:00
amd_nb.c x86/amd_nb: Add PCI ID for family 19h model 78h 2023-05-08 11:25:19 +02:00
aperture_64.c x86: Fix various duplicate-word comment typos 2022-08-15 19:17:52 +02:00
apm_32.c efi: x86: Wire up IBT annotation in memory attributes table 2023-02-09 19:30:54 +01:00
asm-offsets_32.c
asm-offsets_64.c x86: Fixup asm-offsets duplicate 2022-10-17 16:41:06 +02:00
asm-offsets.c x86/smpboot: Remove initial_stack on 64-bit 2023-03-21 13:35:53 +01:00
audit_64.c
bootflag.c
callthunks.c module: replace module_layout with module_memory 2023-03-09 12:55:15 -08:00
cfi.c x86: Add support for CONFIG_CFI_CLANG 2022-09-26 10:13:16 -07:00
check.c
cpuid.c driver core: class: remove module * from class_create() 2023-03-17 15:16:33 +01:00
crash_core_32.c
crash_core_64.c
crash_dump_32.c vmcore: convert copy_oldmem_page() to take an iov_iter 2022-04-29 14:37:59 -07:00
crash_dump_64.c use less confusing names for iov_iter direction initializers 2022-11-25 13:01:55 -05:00
crash.c x86/crash: Disable virt in core NMI crash handler to avoid double shootdown 2023-01-24 10:05:21 -08:00
devicetree.c x86/of: Add support for boot time interrupt delivery mode configuration 2022-12-02 14:57:14 +01:00
doublefault_32.c
dumpstack_32.c x86/percpu: Move irq_stack variables next to current_task 2022-10-17 16:41:05 +02:00
dumpstack_64.c x86/percpu: Move irq_stack variables next to current_task 2022-10-17 16:41:05 +02:00
dumpstack.c x86/show_trace_log_lvl: Ensure stack pointer is aligned, again 2023-05-16 06:31:04 -07:00
e820.c x86/setup: Move duplicate boot_cpu_data definition out of the ifdeffery 2023-01-11 12:45:16 +01:00
early_printk.c x86/earlyprintk: Clean up pciserial 2022-08-29 12:19:25 +02:00
early-quirks.c drm/i915/rpl-p: Add PCI IDs 2022-04-19 17:14:09 -07:00
ebda.c
eisa.c
espfix_64.c x86/espfix: Use get_random_long() rather than archrandom 2022-10-31 20:12:50 +01:00
ftrace_32.S ftrace: selftest: remove broken trace_direct_tramp 2023-03-21 13:59:29 -04:00
ftrace_64.S Objtool changes for v6.4: 2023-04-28 14:02:54 -07:00
ftrace.c New Feature: 2022-12-17 14:06:53 -06:00
head32.c x86/head: Mark *_start_kernel() __noreturn 2023-04-14 17:31:24 +02:00
head64.c x86/head: Mark *_start_kernel() __noreturn 2023-04-14 17:31:24 +02:00
head_32.S x86/asm/32: Remove setup_once() 2022-12-02 14:06:34 +01:00
head_64.S Objtool changes for v6.4: 2023-04-28 14:02:54 -07:00
hpet.c clocksource: Verify HPET and PMTMR when TSC unverified 2023-02-02 14:23:02 -08:00
hw_breakpoint.c x86/amd: Cache debug register values in percpu variables 2023-01-31 20:09:26 +01:00
i8237.c
i8253.c
i8259.c x86/i8259: Mark legacy PIC interrupts with IRQ_LEVEL 2023-01-16 17:24:56 +01:00
idt.c x86/traps: Add #VE support for TDX guest 2022-04-07 08:27:51 -07:00
io_delay.c
ioport.c
irq_32.c x86/percpu: Move irq_stack variables next to current_task 2022-10-17 16:41:05 +02:00
irq_64.c x86/percpu: Move irq_stack variables next to current_task 2022-10-17 16:41:05 +02:00
irq_work.c
irq.c
irqflags.S x86: Prepare asm files for straight-line-speculation 2021-12-08 12:25:37 +01:00
irqinit.c x86/i8259: Mark legacy PIC interrupts with IRQ_LEVEL 2023-01-16 17:24:56 +01:00
itmt.c x86: Simplify one-level sysctl registration for itmt_kern_table 2023-03-22 11:47:21 -07:00
jailhouse.c
jump_label.c jump_label: make initial NOP patching the special case 2022-06-24 09:48:55 +02:00
kdebugfs.c x86/boot: Fix memremap of setup_indirect structures 2022-03-09 12:49:44 +01:00
kexec-bzimage64.c docs: move x86 documentation into Documentation/arch/ 2023-03-30 12:58:51 -06:00
kgdb.c
ksysfs.c x86/boot: Fix memremap of setup_indirect structures 2022-03-09 12:49:44 +01:00
kvm.c ARM64: 2022-12-15 11:12:21 -08:00
kvmclock.c sched/clock/x86: Mark sched_clock() noinstr 2023-01-31 15:01:47 +01:00
ldt.c
machine_kexec_32.c
machine_kexec_64.c x86/kexec: remove unnecessary arch_kexec_kernel_image_load() 2023-04-08 13:45:38 -07:00
Makefile rethook, fprobe: do not trace rethook related functions 2023-05-18 07:08:01 +09:00
mmconf-fam10h_64.c
module.c module: replace module_layout with module_memory 2023-03-09 12:55:15 -08:00
mpparse.c
msr.c driver core: class: remove module * from class_create() 2023-03-17 15:16:33 +01:00
nmi_selftest.c
nmi.c ARM: 2023-02-25 11:30:21 -08:00
paravirt-spinlocks.c
paravirt.c x86/paravirt: Convert simple paravirt functions to asm 2023-03-17 13:29:47 +01:00
pci-dma.c docs: move x86 documentation into Documentation/arch/ 2023-03-30 12:58:51 -06:00
pcspeaker.c
perf_regs.c
platform-quirks.c
pmem.c x86/pmem: Fix platform-device leak in error path 2022-06-20 18:01:16 +02:00
probe_roms.c x86/kernel: Validate ROM memory before accessing when SEV-SNP is active 2022-04-06 13:23:09 +02:00
process_32.c x86/resctl: fix scheduler confusion with 'current' 2023-03-08 11:48:11 -08:00
process_64.c IOMMU Updates for Linux 6.4 2023-04-30 13:00:38 -07:00
process.c Objtool changes for v6.4: 2023-04-28 14:02:54 -07:00
process.h x86: Snapshot thread flags 2021-12-01 00:06:43 +01:00
ptrace.c x86: Improve formatting of user_regset arrays 2022-11-01 15:36:52 -07:00
pvclock.c sched/clock/x86: Mark sched_clock() noinstr 2023-01-31 15:01:47 +01:00
quirks.c
reboot_fixups_32.c
reboot.c cpu: Mark nmi_panic_self_stop() __noreturn 2023-04-14 17:31:26 +02:00
relocate_kernel_32.S x86/kexec: Disable RET on kexec 2022-07-09 13:12:32 +02:00
relocate_kernel_64.S x86,objtool: Split UNWIND_HINT_EMPTY in two 2023-03-23 23:18:58 +01:00
resource.c x86/PCI: Tidy E820 removal messages 2022-12-10 10:33:11 -06:00
rethook.c x86,rethook: Fix arch_rethook_trampoline() to generate a complete pt_regs 2022-03-28 19:38:51 -07:00
rtc.c x86/rtc: Simplify PNP ids check 2023-01-06 04:22:34 +01:00
setup_percpu.c - Add the call depth tracking mitigation for Retbleed which has 2022-12-14 15:03:00 -08:00
setup.c x86/setup: Move duplicate boot_cpu_data definition out of the ifdeffery 2023-01-11 12:45:16 +01:00
sev_verify_cbit.S x86: Prepare asm files for straight-line-speculation 2021-12-08 12:25:37 +01:00
sev-shared.c Revert "x86/sev: Expose sev_es_ghcb_hv_call() for use by HyperV" 2022-07-27 18:09:13 +02:00
sev.c x86/sev: Change snp_guest_issue_request()'s fw_err argument 2023-03-21 15:43:19 +01:00
signal_32.c - Cache the AMD debug registers in per-CPU variables to avoid MSR writes 2023-02-21 14:51:40 -08:00
signal_64.c x86/signal/compat: Move sigaction_compat_abi() to signal_64.c 2023-01-06 04:16:02 +01:00
signal.c x86/signal: Fix the value returned by strict_sas_size() 2023-01-15 09:54:27 +01:00
smp.c x86/reboot: Disable SVM, not just VMX, when stopping CPUs 2023-01-24 10:05:22 -08:00
smpboot.c Objtool changes for v6.4: 2023-04-28 14:02:54 -07:00
stacktrace.c x86: remove __range_not_ok() 2022-02-25 09:36:05 +01:00
static_call.c x86/static_call: Add support for Jcc tail-calls 2023-01-31 15:05:31 +01:00
step.c ptrace: Reimplement PTRACE_KILL by always sending SIGKILL 2022-05-11 14:34:28 -05:00
sys_ia32.c
sys_x86_64.c x86/mm: Cleanup the control_va_addr_alignment() __setup handler 2022-05-04 18:20:42 +02:00
tboot.c mm: remove rb tree. 2022-09-26 19:46:16 -07:00
time.c
tls.c x86/gsseg: Move load_gs_index() to its own new header file 2023-01-12 13:06:36 +01:00
tls.h
topology.c x86/cpu: Switch to cpu_feature_enabled() for X86_FEATURE_XENPV 2022-11-22 16:18:19 +01:00
trace_clock.c
trace.c
tracepoint.c x86/traceponit: Fix comment about irq vector tracepoints 2022-05-26 22:03:52 -04:00
traps.c IOMMU Updates for Linux 6.4 2023-04-30 13:00:38 -07:00
tsc_msr.c
tsc_sync.c x86/tsc: Add a timer to make sure TSC_adjust is always checked 2021-12-02 00:40:35 +01:00
tsc.c Updates for timekeeping, timers and clockevent/source drivers: 2023-02-21 09:45:13 -08:00
umip.c
unwind_frame.c x86: kmsan: don't instrument stack walking functions 2022-10-03 14:03:25 -07:00
unwind_guess.c
unwind_orc.c x86,objtool: Split UNWIND_HINT_EMPTY in two 2023-03-23 23:18:58 +01:00
uprobes.c uprobes/x86: Allow to probe a NOP instruction with 0x66 prefix 2022-12-05 11:55:18 +01:00
verify_cpu.S x86: Prepare asm files for straight-line-speculation 2021-12-08 12:25:37 +01:00
vm86_32.c x86/32: Remove lazy GS macros 2022-04-14 14:09:43 +02:00
vmlinux.lds.S objtool/idle: Validate __cpuidle code as noinstr 2023-01-13 11:48:15 +01:00
vsmp_64.c
x86_init.c - Add the necessary glue so that the kernel can run as a confidential 2023-04-25 10:48:08 -07:00