mirror of
https://mirrors.bfsu.edu.cn/git/linux.git
synced 2024-11-11 12:28:41 +08:00
Merge branch 'x86-bsp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 BSP hotplug changes from Ingo Molnar:

"This tree enables CPU#0 (the boot processor) to be onlined/offlined on x86, just like any other CPU. Enabled on Intel CPUs for now.

Allowing this required the identification and fixing of latent CPU#0 assumptions (such as CPU#0 initializations, etc.) in the x86 architecture code, plus the identification of barriers to BSP-offlining, such as active PIC interrupts which can only be serviced on the BSP.

It's behind a default-off option, and there's a debug option that allows the automatic testing of this feature.

The motivation of this feature is to allow and prepare for true CPU-hotplug hardware support: recent changes to MCE support enable us to detect a deteriorating but not yet hard-failing L1/L2 cache on a CPU that could be soft-unplugged - or a failing L3 cache on a multi-socket system.

Note that true hardware hot-plug is not yet fully enabled by this, because that requires a special platform wakeup sequence to be sent to the freshly powered up CPU#0. Future patches for this are planned, once such a platform exists. Chicken and egg"

* 'x86-bsp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, topology: Debug CPU0 hotplug
  x86/i387.c: Initialize thread xstate only on CPU0 only once
  x86, hotplug: Handle retrigger irq by the first available CPU
  x86, hotplug: The first online processor saves the MTRR state
  x86, hotplug: During CPU0 online, enable x2apic, set_numa_node.
  x86, hotplug: Wake up CPU0 via NMI instead of INIT, SIPI, SIPI
  x86-32, hotplug: Add start_cpu0() entry point to head_32.S
  x86-64, hotplug: Add start_cpu0() entry point to head_64.S
  kernel/cpu.c: Add comment for priority in cpu_hotplug_pm_callback
  x86, hotplug, suspend: Online CPU0 for suspend or hibernate
  x86, hotplug: Support functions for CPU0 online/offline
  x86, topology: Don't offline CPU0 if any PIC irq can not be migrated out of it
  x86, Kconfig: Add config switch for CPU0 hotplug
  doc: Add x86 CPU0 online/offline feature
commit
74b8423345
@@ -207,6 +207,30 @@ by making it not-removable.

 In such cases you will also notice that the online file is missing under cpu0.

+Q: Is CPU0 removable on X86?
+A: Yes. If kernel is compiled with CONFIG_BOOTPARAM_HOTPLUG_CPU0=y, CPU0 is
+removable by default. Otherwise, CPU0 is also removable by kernel option
+cpu0_hotplug.
+
+But some features depend on CPU0. Two known dependencies are:
+
+1. Resume from hibernate/suspend depends on CPU0. Hibernate/suspend will fail if
+CPU0 is offline and you need to online CPU0 before hibernate/suspend can
+continue.
+2. PIC interrupts also depend on CPU0. CPU0 can't be removed if a PIC interrupt
+is detected.
+
+It's said poweroff/reboot may depend on CPU0 on some machines although I haven't
+seen any poweroff/reboot failure so far after CPU0 is offline on a few tested
+machines.
+
+Please let me know if you know or see any other dependencies of CPU0.
+
+If the dependencies are under your control, you can turn on CPU0 hotplug feature
+either by CONFIG_BOOTPARAM_HOTPLUG_CPU0 or by kernel parameter cpu0_hotplug.
+
+--Fenghua Yu <fenghua.yu@intel.com>
+
 Q: How do i find out if a particular CPU is not removable?
 A: Depending on the implementation, some architectures may show this by the
 absence of the "online" file. This is done if it can be determined ahead of
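The documentation hunk above says a non-removable CPU shows up as a missing "online" file. A minimal userspace sketch of that check follows. The helper name `read_online_file` is hypothetical and not part of the kernel or any library; the caller supplies the path.

```c
#include <stdio.h>

/*
 * Hypothetical helper, for illustration only.
 * Reads a sysfs-style "online" control file.
 * Returns 1 if the file says the CPU is online, 0 if it says offline,
 * and -1 if the file is missing or unreadable. A missing file is what
 * the text above describes for a CPU that cannot be removed.
 */
int read_online_file(const char *path)
{
	FILE *f = fopen(path, "r");
	int state;

	if (!f)
		return -1;
	if (fscanf(f, "%d", &state) != 1 || (state != 0 && state != 1))
		state = -1;
	fclose(f);
	return state;
}
```

On a running system the path to pass would be `/sys/devices/system/cpu/cpu0/online`. With this patch, that file is expected to exist for CPU0 only when CPU0 hotplug has been enabled and no blocking dependency was found.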
|
@@ -1984,6 +1984,20 @@ bytes respectively. Such letter suffixes can also be entirely omitted.

 	nox2apic	[X86-64,APIC] Do not enable x2APIC mode.

+	cpu0_hotplug	[X86] Turn on CPU0 hotplug feature when
+			CONFIG_BOOTPARAM_HOTPLUG_CPU0 is off.
+			Some features depend on CPU0. Known dependencies are:
+			1. Resume from suspend/hibernate depends on CPU0.
+			Suspend/hibernate will fail if CPU0 is offline and you
+			need to online CPU0 before suspend/hibernate.
+			2. PIC interrupts also depend on CPU0. CPU0 can't be
+			removed if a PIC interrupt is detected.
+			It's said poweroff/reboot may depend on CPU0 on some
+			machines although I haven't seen such issues so far
+			after CPU0 is offline on a few tested machines.
+			If the dependencies are under your control, you can
+			turn on cpu0_hotplug.
+
 	nptcg=		[IA-64] Override max number of concurrent global TLB
 			purges which is reported from either PAL_VM_SUMMARY or
 			SAL PALO.
|
@@ -1698,6 +1698,50 @@ config HOTPLUG_CPU
 	  automatically on SMP systems. )
 	  Say N if you want to disable CPU hotplug.

+config BOOTPARAM_HOTPLUG_CPU0
+	bool "Set default setting of cpu0_hotpluggable"
+	default n
+	depends on HOTPLUG_CPU && EXPERIMENTAL
+	---help---
+	  Set whether default state of cpu0_hotpluggable is on or off.
+
+	  Say Y here to enable CPU0 hotplug by default. If this switch
+	  is turned on, there is no need to give cpu0_hotplug kernel
+	  parameter and the CPU0 hotplug feature is enabled by default.
+
+	  Please note: there are two known CPU0 dependencies if you want
+	  to enable the CPU0 hotplug feature either by this switch or by
+	  cpu0_hotplug kernel parameter.
+
+	  First, resume from hibernate or suspend always starts from CPU0.
+	  So hibernate and suspend are prevented if CPU0 is offline.
+
+	  Second dependency is PIC interrupts always go to CPU0. CPU0 can not
+	  offline if any interrupt can not migrate out of CPU0. There may
+	  be other CPU0 dependencies.
+
+	  Please make sure the dependencies are under your control before
+	  you enable this feature.
+
+	  Say N if you don't want to enable CPU0 hotplug feature by default.
+	  You still can enable the CPU0 hotplug feature at boot by kernel
+	  parameter cpu0_hotplug.
+
+config DEBUG_HOTPLUG_CPU0
+	def_bool n
+	prompt "Debug CPU0 hotplug"
+	depends on HOTPLUG_CPU && EXPERIMENTAL
+	---help---
+	  Enabling this option offlines CPU0 (if CPU0 can be offlined) as
+	  soon as possible and boots up userspace with CPU0 offlined. User
+	  can online CPU0 back after boot time.
+
+	  To debug CPU0 hotplug, you need to enable CPU0 offline/online
+	  feature by either turning on CONFIG_BOOTPARAM_HOTPLUG_CPU0 during
+	  compilation or giving cpu0_hotplug kernel parameter at boot.
+
+	  If unsure, say N.
+
 config COMPAT_VDSO
 	def_bool y
 	prompt "Compat VDSO support"
|
@@ -28,6 +28,10 @@ struct x86_cpu {
 #ifdef CONFIG_HOTPLUG_CPU
 extern int arch_register_cpu(int num);
 extern void arch_unregister_cpu(int);
+extern void __cpuinit start_cpu0(void);
+#ifdef CONFIG_DEBUG_HOTPLUG_CPU0
+extern int _debug_hotplug_cpu(int cpu, int action);
+#endif
 #endif

 DECLARE_PER_CPU(int, cpu_state);
|
@@ -166,6 +166,7 @@ void native_send_call_func_ipi(const struct cpumask *mask);
 void native_send_call_func_single_ipi(int cpu);
 void x86_idle_thread_init(unsigned int cpu, struct task_struct *idle);

+void smp_store_boot_cpu_info(void);
 void smp_store_cpu_info(int id);
 #define cpu_physical_id(cpu)	per_cpu(x86_cpu_to_apicid, cpu)

|
@@ -2199,9 +2199,11 @@ static int ioapic_retrigger_irq(struct irq_data *data)
 {
 	struct irq_cfg *cfg = data->chip_data;
 	unsigned long flags;
+	int cpu;

 	raw_spin_lock_irqsave(&vector_lock, flags);
-	apic->send_IPI_mask(cpumask_of(cpumask_first(cfg->domain)), cfg->vector);
+	cpu = cpumask_first_and(cfg->domain, cpu_online_mask);
+	apic->send_IPI_mask(cpumask_of(cpu), cfg->vector);
 	raw_spin_unlock_irqrestore(&vector_lock, flags);

 	return 1;
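The hunk above stops picking the first CPU in the vector domain and instead picks the first CPU that is both in the domain and online. A small userspace sketch of that difference, using a 64-bit integer as a stand-in for a cpumask, might look like this. The names `mask_first` and `mask_first_and` are hypothetical, and returning -1 for an empty mask is a simplification chosen for this example; the kernel helpers signal "no CPU" differently.

```c
#include <stdint.h>

/* Index of the lowest set bit, or -1 if the mask is empty (up to 64 CPUs). */
int mask_first(uint64_t mask)
{
	for (int cpu = 0; cpu < 64; cpu++)
		if (mask & (UINT64_C(1) << cpu))
			return cpu;
	return -1;
}

/* Stand-in for cpumask_first_and(): first CPU present in both masks. */
int mask_first_and(uint64_t a, uint64_t b)
{
	return mask_first(a & b);
}
```

With a domain of CPUs 0 and 1 (`0x3`) and CPUs 1 to 3 online (`0xE`), `mask_first(0x3)` still names the offline CPU 0, while `mask_first_and(0x3, 0xE)` names CPU 1, which is the case the patch is guarding against once CPU0 can go offline.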
|
@@ -1237,7 +1237,7 @@ void __cpuinit cpu_init(void)
 	oist = &per_cpu(orig_ist, cpu);

 #ifdef CONFIG_NUMA
-	if (cpu != 0 && this_cpu_read(numa_node) == 0 &&
+	if (this_cpu_read(numa_node) == 0 &&
 	    early_cpu_to_node(cpu) != NUMA_NO_NODE)
 		set_numa_node(early_cpu_to_node(cpu));
 #endif
@@ -1269,8 +1269,7 @@ void __cpuinit cpu_init(void)
 	barrier();

 	x86_configure_nx();
-	if (cpu != 0)
-		enable_x2apic();
+	enable_x2apic();

 	/*
 	 * set up and load the per-CPU TSS
|
@@ -695,11 +695,16 @@ void mtrr_ap_init(void)
 }

 /**
- * Save current fixed-range MTRR state of the BSP
+ * Save current fixed-range MTRR state of the first cpu in cpu_online_mask.
  */
 void mtrr_save_state(void)
 {
-	smp_call_function_single(0, mtrr_save_fixed_ranges, NULL, 1);
+	int first_cpu;
+
+	get_online_cpus();
+	first_cpu = cpumask_first(cpu_online_mask);
+	smp_call_function_single(first_cpu, mtrr_save_fixed_ranges, NULL, 1);
+	put_online_cpus();
 }

 void set_mtrr_aps_delayed_init(void)
|
@@ -266,6 +266,19 @@ num_subarch_entries = (. - subarch_entries) / 4
 	jmp default_entry
 #endif /* CONFIG_PARAVIRT */

+#ifdef CONFIG_HOTPLUG_CPU
+/*
+ * Boot CPU0 entry point. It's called from play_dead(). Everything has been set
+ * up already except stack. We just set up stack here. Then call
+ * start_secondary().
+ */
+ENTRY(start_cpu0)
+	movl stack_start, %ecx
+	movl %ecx, %esp
+	jmp  *(initial_code)
+ENDPROC(start_cpu0)
+#endif
+
 /*
  * Non-boot CPU entry point; entered from trampoline.S
  * We can't lgdt here, because lgdt itself uses a data segment, but
|
@@ -252,6 +252,22 @@ ENTRY(secondary_startup_64)
 	pushq	%rax		# target address in negative space
 	lretq

+#ifdef CONFIG_HOTPLUG_CPU
+/*
+ * Boot CPU0 entry point. It's called from play_dead(). Everything has been set
+ * up already except stack. We just set up stack here. Then call
+ * start_secondary().
+ */
+ENTRY(start_cpu0)
+	movq stack_start(%rip),%rsp
+	movq	initial_code(%rip),%rax
+	pushq	$0		# fake return address to stop unwinder
+	pushq	$__KERNEL_CS	# set correct cs
+	pushq	%rax		# target address in negative space
+	lretq
+ENDPROC(start_cpu0)
+#endif
+
 	/* SMP bootup changes these two */
 	__REFDATA
 	.align	8
|
@@ -175,7 +175,11 @@ void __cpuinit fpu_init(void)
 		cr0 |= X86_CR0_EM;
 	write_cr0(cr0);

-	if (!smp_processor_id())
+	/*
+	 * init_thread_xstate is only called once to avoid overriding
+	 * xstate_size during boot time or during CPU hotplug.
+	 */
+	if (xstate_size == 0)
 		init_thread_xstate();

 	mxcsr_feature_mask_init();
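The hunk above replaces "initialize on CPU 0" with "initialize once, whichever CPU gets there first", so that re-onlining CPU0 does not redo the setup. A minimal sketch of that guard is below. All names ending in `_sketch` are hypothetical, and the size value 512 is an arbitrary placeholder picked for the example.

```c
#include <stddef.h>

size_t xstate_size_sketch;	/* 0 means "not initialized yet" */
int init_calls_sketch;		/* how many times the one-time init ran */

/* Stand-in for the one-time setup; records a size and counts the call. */
void init_thread_xstate_sketch(void)
{
	xstate_size_sketch = 512;	/* placeholder value for the example */
	init_calls_sketch++;
}

/* Runs on every CPU bring-up, including a re-onlined CPU 0. */
void fpu_init_sketch(int cpu)
{
	(void)cpu;	/* the guard no longer depends on which CPU is running */

	if (xstate_size_sketch == 0)
		init_thread_xstate_sketch();
}
```

Calling `fpu_init_sketch` for CPU 0 at boot, then for CPU 1, then for CPU 0 again after an offline/online cycle runs the setup only the first time. The old `!smp_processor_id()` test would have run it again on the third call.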
|
@@ -127,8 +127,8 @@ EXPORT_PER_CPU_SYMBOL(cpu_info);
 atomic_t init_deasserted;

 /*
- * Report back to the Boot Processor.
- * Running on AP.
+ * Report back to the Boot Processor during boot time or to the caller processor
+ * during CPU online.
  */
 static void __cpuinit smp_callin(void)
 {
@@ -140,15 +140,17 @@ static void __cpuinit smp_callin(void)
 	 * we may get here before an INIT-deassert IPI reaches
 	 * our local APIC. We have to wait for the IPI or we'll
 	 * lock up on an APIC access.
+	 *
+	 * Since CPU0 is not wakened up by INIT, it doesn't wait for the IPI.
 	 */
-	if (apic->wait_for_init_deassert)
+	cpuid = smp_processor_id();
+	if (apic->wait_for_init_deassert && cpuid != 0)
 		apic->wait_for_init_deassert(&init_deasserted);

 	/*
 	 * (This works even if the APIC is not enabled.)
 	 */
 	phys_id = read_apic_id();
-	cpuid = smp_processor_id();
 	if (cpumask_test_cpu(cpuid, cpu_callin_mask)) {
 		panic("%s: phys CPU#%d, CPU#%d already present??\n", __func__,
 					phys_id, cpuid);
@@ -230,6 +232,8 @@ static void __cpuinit smp_callin(void)
 	cpumask_set_cpu(cpuid, cpu_callin_mask);
 }

+static int cpu0_logical_apicid;
+static int enable_start_cpu0;
 /*
  * Activate a secondary processor.
  */
@@ -245,6 +249,8 @@ notrace static void __cpuinit start_secondary(void *unused)
 	preempt_disable();
 	smp_callin();

+	enable_start_cpu0 = 0;
+
 #ifdef CONFIG_X86_32
 	/* switch away from the initial page table */
 	load_cr3(swapper_pg_dir);
@@ -281,19 +287,30 @@ notrace static void __cpuinit start_secondary(void *unused)
 	cpu_idle();
 }

+void __init smp_store_boot_cpu_info(void)
+{
+	int id = 0; /* CPU 0 */
+	struct cpuinfo_x86 *c = &cpu_data(id);
+
+	*c = boot_cpu_data;
+	c->cpu_index = id;
+}
+
 /*
  * The bootstrap kernel entry code has set these up. Save them for
  * a given CPU
  */

 void __cpuinit smp_store_cpu_info(int id)
 {
 	struct cpuinfo_x86 *c = &cpu_data(id);

 	*c = boot_cpu_data;
 	c->cpu_index = id;
-	if (id != 0)
-		identify_secondary_cpu(c);
+	/*
+	 * During boot time, CPU0 has this setup already. Save the info when
+	 * bringing up AP or offlined CPU0.
+	 */
+	identify_secondary_cpu(c);
 }

 static bool __cpuinit
@@ -483,7 +500,7 @@ void __inquire_remote_apic(int apicid)
  * won't ... remember to clear down the APIC, etc later.
  */
 int __cpuinit
-wakeup_secondary_cpu_via_nmi(int logical_apicid, unsigned long start_eip)
+wakeup_secondary_cpu_via_nmi(int apicid, unsigned long start_eip)
 {
 	unsigned long send_status, accept_status = 0;
 	int maxlvt;
@@ -491,7 +508,7 @@ wakeup_secondary_cpu_via_nmi(int logical_apicid, unsigned long start_eip)
 	/* Target chip */
 	/* Boot on the stack */
 	/* Kick the second */
-	apic_icr_write(APIC_DM_NMI | apic->dest_logical, logical_apicid);
+	apic_icr_write(APIC_DM_NMI | apic->dest_logical, apicid);

 	pr_debug("Waiting for send to finish...\n");
 	send_status = safe_apic_wait_icr_idle();
@@ -651,6 +668,63 @@ static void __cpuinit announce_cpu(int cpu, int apicid)
 			node, cpu, apicid);
 }

+static int wakeup_cpu0_nmi(unsigned int cmd, struct pt_regs *regs)
+{
+	int cpu;
+
+	cpu = smp_processor_id();
+	if (cpu == 0 && !cpu_online(cpu) && enable_start_cpu0)
+		return NMI_HANDLED;
+
+	return NMI_DONE;
+}
+
+/*
+ * Wake up AP by INIT, INIT, STARTUP sequence.
+ *
+ * Instead of waiting for STARTUP after INITs, BSP will execute the BIOS
+ * boot-strap code which is not a desired behavior for waking up BSP. To
+ * avoid the boot-strap code, wake up CPU0 by NMI instead.
+ *
+ * This works to wake up soft offlined CPU0 only. If CPU0 is hard offlined
+ * (i.e. physically hot removed and then hot added), NMI won't wake it up.
+ * We'll change this code in the future to wake up hard offlined CPU0 if
+ * real platform and request are available.
+ */
+static int __cpuinit
+wakeup_cpu_via_init_nmi(int cpu, unsigned long start_ip, int apicid,
+	       int *cpu0_nmi_registered)
+{
+	int id;
+	int boot_error;
+
+	/*
+	 * Wake up AP by INIT, INIT, STARTUP sequence.
+	 */
+	if (cpu)
+		return wakeup_secondary_cpu_via_init(apicid, start_ip);
+
+	/*
+	 * Wake up BSP by nmi.
+	 *
+	 * Register a NMI handler to help wake up CPU0.
+	 */
+	boot_error = register_nmi_handler(NMI_LOCAL,
+					  wakeup_cpu0_nmi, 0, "wake_cpu0");
+
+	if (!boot_error) {
+		enable_start_cpu0 = 1;
+		*cpu0_nmi_registered = 1;
+		if (apic->dest_logical == APIC_DEST_LOGICAL)
+			id = cpu0_logical_apicid;
+		else
+			id = apicid;
+		boot_error = wakeup_secondary_cpu_via_nmi(id, start_ip);
+	}
+
+	return boot_error;
+}
+
 /*
  * NOTE - on most systems this is a PHYSICAL apic ID, but on multiquad
  * (ie clustered apic addressing mode), this is a LOGICAL apic ID.
@@ -666,6 +740,7 @@ static int __cpuinit do_boot_cpu(int apicid, int cpu, struct task_struct *idle)

 	unsigned long boot_error = 0;
 	int timeout;
+	int cpu0_nmi_registered = 0;

 	/* Just in case we booted with a single CPU. */
 	alternatives_enable_smp();
@@ -713,13 +788,16 @@ static int __cpuinit do_boot_cpu(int apicid, int cpu, struct task_struct *idle)
 	}

 	/*
-	 * Kick the secondary CPU. Use the method in the APIC driver
-	 * if it's defined - or use an INIT boot APIC message otherwise:
+	 * Wake up a CPU in different cases:
+	 * - Use the method in the APIC driver if it's defined
+	 * Otherwise,
+	 * - Use an INIT boot APIC message for APs or NMI for BSP.
 	 */
 	if (apic->wakeup_secondary_cpu)
 		boot_error = apic->wakeup_secondary_cpu(apicid, start_ip);
 	else
-		boot_error = wakeup_secondary_cpu_via_init(apicid, start_ip);
+		boot_error = wakeup_cpu_via_init_nmi(cpu, start_ip, apicid,
+						     &cpu0_nmi_registered);

 	if (!boot_error) {
 		/*
@@ -784,6 +862,13 @@ static int __cpuinit do_boot_cpu(int apicid, int cpu, struct task_struct *idle)
 		 */
 		smpboot_restore_warm_reset_vector();
 	}
+	/*
+	 * Clean up the nmi handler. Do this after the callin and callout sync
+	 * to avoid impact of possible long unregister time.
+	 */
+	if (cpu0_nmi_registered)
+		unregister_nmi_handler(NMI_LOCAL, "wake_cpu0");
+
 	return boot_error;
 }

@@ -797,7 +882,7 @@ int __cpuinit native_cpu_up(unsigned int cpu, struct task_struct *tidle)

 	pr_debug("++++++++++++++++++++=_---CPU UP %u\n", cpu);

-	if (apicid == BAD_APICID || apicid == boot_cpu_physical_apicid ||
+	if (apicid == BAD_APICID ||
 	    !physid_isset(apicid, phys_cpu_present_map) ||
 	    !apic->apic_id_valid(apicid)) {
 		pr_err("%s: bad cpu %d\n", __func__, cpu);
@@ -995,7 +1080,7 @@ void __init native_smp_prepare_cpus(unsigned int max_cpus)
 	/*
 	 * Setup boot CPU information
 	 */
-	smp_store_cpu_info(0); /* Final full version of the data */
+	smp_store_boot_cpu_info(); /* Final full version of the data */
 	cpumask_copy(cpu_callin_mask, cpumask_of(0));
 	mb();

@@ -1031,6 +1116,11 @@ void __init native_smp_prepare_cpus(unsigned int max_cpus)
 	 */
 	setup_local_APIC();

+	if (x2apic_mode)
+		cpu0_logical_apicid = apic_read(APIC_LDR);
+	else
+		cpu0_logical_apicid = GET_APIC_LOGICAL_ID(apic_read(APIC_LDR));
+
 	/*
 	 * Enable IO APIC before setting up error vector
 	 */
@@ -1219,19 +1309,6 @@ void cpu_disable_common(void)

 int native_cpu_disable(void)
 {
-	int cpu = smp_processor_id();
-
-	/*
-	 * Perhaps use cpufreq to drop frequency, but that could go
-	 * into generic code.
-	 *
-	 * We won't take down the boot processor on i386 due to some
-	 * interrupts only being able to be serviced by the BSP.
-	 * Especially so if we're not using an IOAPIC -zwane
-	 */
-	if (cpu == 0)
-		return -EBUSY;
-
 	clear_local_APIC();

 	cpu_disable_common();
@@ -1271,6 +1348,14 @@ void play_dead_common(void)
 	local_irq_disable();
 }

+static bool wakeup_cpu0(void)
+{
+	if (smp_processor_id() == 0 && enable_start_cpu0)
+		return true;
+
+	return false;
+}
+
 /*
  * We need to flush the caches before going to sleep, lest we have
  * dirty data in our caches when we come back up.
@@ -1334,6 +1419,11 @@ static inline void mwait_play_dead(void)
 		__monitor(mwait_ptr, 0, 0);
 		mb();
 		__mwait(eax, 0);
+		/*
+		 * If NMI wants to wake up CPU0, start CPU0.
+		 */
+		if (wakeup_cpu0())
+			start_cpu0();
 	}
 }

@@ -1344,6 +1434,11 @@ static inline void hlt_play_dead(void)

 	while (1) {
 		native_halt();
+		/*
+		 * If NMI wants to wake up CPU0, start CPU0.
+		 */
+		if (wakeup_cpu0())
+			start_cpu0();
 	}
 }

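The smpboot changes above add a small handshake. The CPU requesting the wake-up sets `enable_start_cpu0` and sends an NMI; the NMI handler claims the NMI only when it runs on an offline CPU 0 with the flag set; and the halted CPU 0 checks the same flag after leaving `hlt`/`mwait` before jumping to `start_cpu0()`. A minimal sketch of those two checks as pure functions follows. The struct and the `_sketch`/`_SKETCH` names are hypothetical stand-ins, not kernel definitions.

```c
#include <stdbool.h>

/* Stand-ins for the kernel's NMI_DONE / NMI_HANDLED return codes. */
enum nmi_result_sketch { NMI_DONE_SKETCH = 0, NMI_HANDLED_SKETCH = 1 };

struct cpu0_wake_state {
	bool cpu0_online;	/* stand-in for cpu_online(0) */
	bool enable_start_cpu0;	/* set by the CPU requesting the wake-up */
};

/* Decision made inside the NMI handler on whichever CPU received the NMI. */
enum nmi_result_sketch
wakeup_cpu0_nmi_sketch(const struct cpu0_wake_state *s, int this_cpu)
{
	if (this_cpu == 0 && !s->cpu0_online && s->enable_start_cpu0)
		return NMI_HANDLED_SKETCH;
	return NMI_DONE_SKETCH;
}

/* Check made by the halted CPU once it resumes after hlt or mwait. */
bool should_start_cpu0(const struct cpu0_wake_state *s, int this_cpu)
{
	return this_cpu == 0 && s->enable_start_cpu0;
}
```

In the patch the flag is cleared again in `start_secondary()`, so after a successful wake-up both checks go back to rejecting stray NMIs.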
|
@@ -30,23 +30,110 @@
 #include <linux/mmzone.h>
 #include <linux/init.h>
 #include <linux/smp.h>
+#include <linux/irq.h>
 #include <asm/cpu.h>

 static DEFINE_PER_CPU(struct x86_cpu, cpu_devices);

 #ifdef CONFIG_HOTPLUG_CPU
+
+#ifdef CONFIG_BOOTPARAM_HOTPLUG_CPU0
+static int cpu0_hotpluggable = 1;
+#else
+static int cpu0_hotpluggable;
+static int __init enable_cpu0_hotplug(char *str)
+{
+	cpu0_hotpluggable = 1;
+	return 1;
+}
+
+__setup("cpu0_hotplug", enable_cpu0_hotplug);
+#endif
+
+#ifdef CONFIG_DEBUG_HOTPLUG_CPU0
+/*
+ * This function offlines a CPU as early as possible and allows userspace to
+ * boot up without the CPU. The CPU can be onlined back by user after boot.
+ *
+ * This is only called for debugging CPU offline/online feature.
+ */
+int __ref _debug_hotplug_cpu(int cpu, int action)
+{
+	struct device *dev = get_cpu_device(cpu);
+	int ret;
+
+	if (!cpu_is_hotpluggable(cpu))
+		return -EINVAL;
+
+	cpu_hotplug_driver_lock();
+
+	switch (action) {
+	case 0:
+		ret = cpu_down(cpu);
+		if (!ret) {
+			pr_info("CPU %u is now offline\n", cpu);
+			kobject_uevent(&dev->kobj, KOBJ_OFFLINE);
+		} else
+			pr_debug("Can't offline CPU%d.\n", cpu);
+		break;
+	case 1:
+		ret = cpu_up(cpu);
+		if (!ret)
+			kobject_uevent(&dev->kobj, KOBJ_ONLINE);
+		else
+			pr_debug("Can't online CPU%d.\n", cpu);
+		break;
+	default:
+		ret = -EINVAL;
+	}
+
+	cpu_hotplug_driver_unlock();
+
+	return ret;
+}
+
+static int __init debug_hotplug_cpu(void)
+{
+	_debug_hotplug_cpu(0, 0);
+	return 0;
+}
+
+late_initcall_sync(debug_hotplug_cpu);
+#endif /* CONFIG_DEBUG_HOTPLUG_CPU0 */
+
 int __ref arch_register_cpu(int num)
 {
+	struct cpuinfo_x86 *c = &cpu_data(num);
+
 	/*
-	 * CPU0 cannot be offlined due to several
-	 * restrictions and assumptions in kernel. This basically
-	 * doesn't add a control file, one cannot attempt to offline
-	 * BSP.
-	 *
-	 * Also certain PCI quirks require not to enable hotplug control
-	 * for all CPU's.
+	 * Currently CPU0 is only hotpluggable on Intel platforms. Other
+	 * vendors can add hotplug support later.
 	 */
-	if (num)
+	if (c->x86_vendor != X86_VENDOR_INTEL)
+		cpu0_hotpluggable = 0;
+
+	/*
+	 * Two known BSP/CPU0 dependencies: Resume from suspend/hibernate
+	 * depends on BSP. PIC interrupts depend on BSP.
+	 *
+	 * If the BSP dependencies are under control, one can tell kernel to
+	 * enable BSP hotplug. This basically adds a control file and
+	 * one can attempt to offline BSP.
+	 */
+	if (num == 0 && cpu0_hotpluggable) {
+		unsigned int irq;
+		/*
+		 * We won't take down the boot processor on i386 if some
+		 * interrupts only are able to be serviced by the BSP in PIC.
+		 */
+		for_each_active_irq(irq) {
+			if (!IO_APIC_IRQ(irq) && irq_has_action(irq)) {
+				cpu0_hotpluggable = 0;
+				break;
+			}
+		}
+	}
+	if (num || cpu0_hotpluggable)
 		per_cpu(cpu_devices, num).cpu.hotpluggable = 1;

 	return register_cpu(&per_cpu(cpu_devices, num).cpu, num);
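The reworked `arch_register_cpu()` above decides whether a CPU gets a hotplug control file from three inputs: the opt-in flag, the CPU vendor, and whether any PIC-routed interrupt has a handler. The sketch below restates that decision as a pure function so the cases can be checked in isolation. The struct and function names are hypothetical. One difference to keep in mind: the patch clears the shared `cpu0_hotpluggable` variable as a side effect, whereas this sketch only computes the result for a single call.

```c
#include <stdbool.h>

struct cpu_reg_input {
	int  num;		/* logical CPU number being registered */
	bool cpu0_hotpluggable;	/* Kconfig default or cpu0_hotplug parameter */
	bool vendor_is_intel;	/* stand-in for the x86_vendor comparison */
	bool pic_irq_active;	/* some non-IO-APIC irq has an action */
};

/* True when the CPU would be marked hotpluggable (gets an "online" file). */
bool cpu_gets_online_file(const struct cpu_reg_input *in)
{
	bool cpu0_ok = in->cpu0_hotpluggable;

	/* Only Intel platforms are allowed to offline CPU0 for now. */
	if (!in->vendor_is_intel)
		cpu0_ok = false;

	/* A PIC interrupt in use pins CPU0, since it cannot be migrated. */
	if (in->num == 0 && cpu0_ok && in->pic_irq_active)
		cpu0_ok = false;

	/* Every non-zero CPU stays hotpluggable, as before the patch. */
	return in->num != 0 || cpu0_ok;
}
```

The last line mirrors `if (num || cpu0_hotpluggable)` in the diff: the new conditions only ever restrict CPU 0.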
|
@@ -21,6 +21,7 @@
 #include <asm/suspend.h>
 #include <asm/debugreg.h>
 #include <asm/fpu-internal.h> /* pcntxt_mask */
+#include <asm/cpu.h>

 #ifdef CONFIG_X86_32
 static struct saved_context saved_context;
@@ -237,3 +238,84 @@ void restore_processor_state(void)
 #ifdef CONFIG_X86_32
 EXPORT_SYMBOL(restore_processor_state);
 #endif
+
+/*
+ * When bsp_check() is called in hibernate and suspend, cpu hotplug
+ * is disabled already. So it's unnecessary to handle race condition between
+ * cpumask query and cpu hotplug.
+ */
+static int bsp_check(void)
+{
+	if (cpumask_first(cpu_online_mask) != 0) {
+		pr_warn("CPU0 is offline.\n");
+		return -ENODEV;
+	}
+
+	return 0;
+}
+
+static int bsp_pm_callback(struct notifier_block *nb, unsigned long action,
+			   void *ptr)
+{
+	int ret = 0;
+
+	switch (action) {
+	case PM_SUSPEND_PREPARE:
+	case PM_HIBERNATION_PREPARE:
+		ret = bsp_check();
+		break;
+#ifdef CONFIG_DEBUG_HOTPLUG_CPU0
+	case PM_RESTORE_PREPARE:
+		/*
+		 * When system resumes from hibernation, online CPU0 because
+		 * 1. it's required for resume and
+		 * 2. the CPU was online before hibernation
+		 */
+		if (!cpu_online(0))
+			_debug_hotplug_cpu(0, 1);
+		break;
+	case PM_POST_RESTORE:
+		/*
+		 * When a resume really happens, this code won't be called.
+		 *
+		 * This code is called only when user space hibernation software
+		 * prepares for snapshot device during boot time. So we just
+		 * call _debug_hotplug_cpu() to restore to CPU0's state prior to
+		 * preparing the snapshot device.
+		 *
+		 * This works for normal boot case in our CPU0 hotplug debug
+		 * mode, i.e. CPU0 is offline and user mode hibernation
+		 * software initializes during boot time.
+		 *
+		 * If CPU0 is online and user application accesses snapshot
+		 * device after boot time, this will offline CPU0 and user may
+		 * see different CPU0 state before and after accessing
+		 * the snapshot device. But hopefully this is not a case when
+		 * user debugging CPU0 hotplug. Even if users hit this case,
+		 * they can easily online CPU0 back.
+		 *
+		 * To simplify this debug code, we only consider normal boot
+		 * case. Otherwise we need to remember CPU0's state and restore
+		 * to that state and resolve racy conditions etc.
+		 */
+		_debug_hotplug_cpu(0, 0);
+		break;
+#endif
+	default:
+		break;
+	}
+	return notifier_from_errno(ret);
+}
+
+static int __init bsp_pm_check_init(void)
+{
+	/*
+	 * Set this bsp_pm_callback as lower priority than
+	 * cpu_hotplug_pm_callback. So cpu_hotplug_pm_callback will be called
+	 * earlier to disable cpu hotplug before bsp online check.
+	 */
+	pm_notifier(bsp_pm_callback, -INT_MAX);
+	return 0;
+}
+
+core_initcall(bsp_pm_check_init);
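The comment in `bsp_pm_check_init()` above relies on notifier priority: the callback registered with priority 0 runs before the one registered with `-INT_MAX`, so hotplug is already disabled when the CPU0 check reads the online mask. The sketch below models that ordering with a small sorted array. It is a simplified userspace stand-in, not the kernel notifier API, and every name in it is hypothetical.

```c
#include <limits.h>
#include <stddef.h>

#define MAX_NOTIFIERS 8

typedef int (*pm_cb_t)(void);	/* returns 0 or a negative error code */

struct notifier_sketch {
	pm_cb_t fn;
	int priority;
};

struct notifier_sketch chain[MAX_NOTIFIERS];
size_t chain_len;

/* Insert so that the chain stays sorted by descending priority. */
int chain_register(pm_cb_t fn, int priority)
{
	size_t i;

	if (chain_len == MAX_NOTIFIERS)
		return -1;
	for (i = chain_len; i > 0 && chain[i - 1].priority < priority; i--)
		chain[i] = chain[i - 1];
	chain[i].fn = fn;
	chain[i].priority = priority;
	chain_len++;
	return 0;
}

/* Call every entry in order; stop at the first one that reports an error. */
int chain_call(void)
{
	for (size_t i = 0; i < chain_len; i++) {
		int ret = chain[i].fn();

		if (ret)
			return ret;
	}
	return 0;
}

int hotplug_disabled;		/* set by the higher-priority callback */
int cpu0_is_online = 1;		/* stand-in for the online-mask query */

int hotplug_cb_sketch(void)
{
	hotplug_disabled = 1;
	return 0;
}

int bsp_cb_sketch(void)
{
	if (!hotplug_disabled)
		return -1;	/* ordering was wrong: check would be racy */
	return cpu0_is_online ? 0 : -19;	/* 19 is ENODEV on Linux */
}
```

Registering `bsp_cb_sketch` at `-INT_MAX` and `hotplug_cb_sketch` at 0, in either order, leaves the hotplug callback first in `chain`. A failing check then stops the walk and propagates its error, which is how a suspend attempt with CPU0 offline gets refused.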
|
@@ -603,6 +603,11 @@ cpu_hotplug_pm_callback(struct notifier_block *nb,

 static int __init cpu_hotplug_pm_sync_init(void)
 {
+	/*
+	 * cpu_hotplug_pm_callback has higher priority than x86
+	 * bsp_pm_callback which depends on cpu_hotplug_pm_callback
+	 * to disable cpu hotplug to avoid cpu hotplug race.
+	 */
 	pm_notifier(cpu_hotplug_pm_callback, 0);
 	return 0;
 }
|