Power management updates for 6.5-rc1

  - Introduce power capping core support for Intel TPMI (Topology Aware
    Register and PM Capsule Interface) and a TPMI interface driver for
    Intel RAPL (Zhang Rui, Dan Carpenter).
 
  - Fix CONFIG_IOSF_MBI dependency in the Intel RAPL power capping
    driver (Zhang Rui).
 
  - Fix invalid initialization for pl4_supported field in the Intel RAPL
    power capping driver (Sumeet Pawnikar).
 
  - Clean up the intel_idle driver, make it work with VM guests that
    cannot use the MWAIT instruction and address the case in which the
    host may enter a deep idle state when the guest is idle (Arjan van
    de Ven).
 
  - Prevent cpufreq drivers from registering if they provide the
    ->adjust_perf() callback without a ->fast_switch() one, which is
    used as a fallback for the former in some cases (Wyes Karny).
 
  - Fix some issues related to the AMD P-state cpufreq driver (Mario
    Limonciello, Wyes Karny).
 
  - Fix the energy_performance_preference attribute handling in the
    intel_pstate driver in passive mode (Tero Kristo).
 
  - Fix the handling of pm_suspend_target_state when CONFIG_PM is unset
    (Kai-Heng Feng).
 
  - Correct spelling mistake in a comment in the hibernation code (Wang
    Honghui).
 
  - Add arch_resume_nosmt() prototype to avoid a "missing prototypes"
    build warning (Arnd Bergmann).
 
  - Restrict pm_pr_dbg() to system-wide power transitions and use it in
    a few additional places (Mario Limonciello).
 
  - Drop verification of in-params from genpd_add_device() and ensure
    that all of its callers will do it (Ulf Hansson).
 
  - Prevent possible integer overflows from occurring in
    genpd_parse_state() (Nikita Zhandarovich).
 
  - Reorder fields in 'struct devfreq_dev_profile' to reduce its size
    somewhat (Christophe JAILLET).
 
  - Ensure that the Exynos PPMU driver is already loaded before the
    Exynos Bus driver starts probing so as to avoid a possible freeze
    during the loading of kernel modules (Marek Szyprowski).
 
  - Fix variable dereferencing before NULL check in the mtk-cci devfreq
    driver (Sukrut Bellary).
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmSZwPASHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRxu44P/AouvVFDMt+eE76nPfNc10X1lswS0vrT
 X7LmylSDPyWiuz6p8MWGXB6T2nQ+3DbvSfVyBJ960ymkBnE/F9me8o3wB8eGbd6z
 ZvOD8+wVPXS4Cq8gUxy2zV1ul+o5IwwT20cYC6mWjasvByl13vTevN5d6ZQ9o6hS
 1hAQQDd6JjsdLIUyU0EbE4aD+l4h96o45IFxbV86qVH77ywa6VMNdulRKmDcONj3
 kM7jHFYL4xl0TfMjHp4IhGWXK32qGYgX1zYTOU5kSc11IExJfVzQcL2uQ9A0KSLp
 RJ0c93loUsHdMhenNkN4nSBFWBIaftKDLbS+5Ubt0DBuNN7kxWivEVts4DM/wxuB
 72PNl5h8YglcW7LHH2IXb/6HEerzbj42+6y459o+M0DcNTq18gu19OQTK5IGtRrQ
 Yf6+5BhgLR3R1REg0eaBg6njtGq0f5fmW7Iqo52eA8cXhHU0MTDJE1p6ytfN40gH
 ViA+T8HB6Mh91lWHVftbwo3wONHGcfJy+S2hGM45V5LKEGeKHILgmw0nUzO7epWE
 7VIPKGzkVd7h/Drk7X3nQR3DJFA/x5eNhjxt5LZD83cVVg34SS3ST5oH13FI9S7Q
 8zwG5KoHTDrmYug3sxQ+Q9pq8MnOl0ZbgqlVfwyiWjKYmNMbg4elsQG/2zD3t0kv
 y8zXbr7Kr4QB
 =SZV7
 -----END PGP SIGNATURE-----

Merge tag 'pm-6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management updates from Rafael Wysocki:
 "These add Intel TPMI (Topology Aware Register and PM Capsule
  Interface) support to the power capping subsystem, extend the
  intel_idle driver to work in VM guests where MWAIT is not available,
  extend the system-wide power management diagnostics, fix bugs and
  clean up code.

  Specifics:

   - Introduce power capping core support for Intel TPMI (Topology Aware
     Register and PM Capsule Interface) and a TPMI interface driver for
     Intel RAPL (Zhang Rui, Dan Carpenter)

   - Fix CONFIG_IOSF_MBI dependency in the Intel RAPL power capping
     driver (Zhang Rui)

   - Fix invalid initialization for pl4_supported field in the Intel
     RAPL power capping driver (Sumeet Pawnikar)

   - Clean up the intel_idle driver, make it work with VM guests that
     cannot use the MWAIT instruction and address the case in which the
     host may enter a deep idle state when the guest is idle (Arjan van
     de Ven)

   - Prevent cpufreq drivers from registering if they provide the
     ->adjust_perf() callback without a ->fast_switch() one, which is
     used as a fallback for the former in some cases (Wyes Karny)

   - Fix some issues related to the AMD P-state cpufreq driver (Mario
     Limonciello, Wyes Karny)

   - Fix the energy_performance_preference attribute handling in the
     intel_pstate driver in passive mode (Tero Kristo)

   - Fix the handling of pm_suspend_target_state when CONFIG_PM is unset
     (Kai-Heng Feng)

   - Correct spelling mistake in a comment in the hibernation code (Wang
     Honghui)

   - Add arch_resume_nosmt() prototype to avoid a "missing prototypes"
     build warning (Arnd Bergmann)

   - Restrict pm_pr_dbg() to system-wide power transitions and use it in
     a few additional places (Mario Limonciello)

   - Drop verification of in-params from genpd_add_device() and ensure
     that all of its callers will do it (Ulf Hansson)

   - Prevent possible integer overflows from occurring in
     genpd_parse_state() (Nikita Zhandarovich)

   - Reorder fields in 'struct devfreq_dev_profile' to reduce its size
     somewhat (Christophe JAILLET)

   - Ensure that the Exynos PPMU driver is already loaded before the
     Exynos Bus driver starts probing so as to avoid a possible freeze
     during the loading of kernel modules (Marek Szyprowski)

   - Fix variable dereferencing before NULL check in the mtk-cci devfreq
     driver (Sukrut Bellary)"

* tag 'pm-6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (42 commits)
  intel_idle: Add a "Long HLT" C1 state for the VM guest mode
  cpufreq: intel_pstate: Fix energy_performance_preference for passive
  cpufreq: amd-pstate: Add a kernel config option to set default mode
  cpufreq: amd-pstate: Set a fallback policy based on preferred_profile
  ACPI: CPPC: Add definition for undefined FADT preferred PM profile value
  cpufreq: amd-pstate: Set default governor to schedutil
  PM: domains: Move the verification of in-params from genpd_add_device()
  cpufreq: amd-pstate: Make amd-pstate EPP driver name hyphenated
  cpufreq: amd-pstate: Write CPPC enable bit per-socket
  intel_idle: Add support for using intel_idle in a VM guest using just hlt
  cpufreq: Fail driver register if it has adjust_perf without fast_switch
  intel_idle: clean up the (new) state_update_enter_method function
  intel_idle: refactor state->enter manipulation into its own function
  platform/x86/amd: pmc: Use pm_pr_dbg() for suspend related messages
  pinctrl: amd: Use pm_pr_dbg to show debugging messages
  ACPI: x86: Add pm_debug_messages for LPS0 _DSM state tracking
  include/linux/suspend.h: Only show pm_pr_dbg messages at suspend/resume
  powercap: RAPL: Fix a NULL vs IS_ERR() bug
  powercap: RAPL: Fix CONFIG_IOSF_MBI dependency
  powercap: RAPL: fix invalid initialization for pl4_supported field
  ...
Linus Torvalds 2023-06-26 19:36:30 -07:00
commit 40e8e98f51
27 changed files with 1326 additions and 496 deletions

drivers/acpi/x86/s2idle.c

@@ -59,6 +59,7 @@ static int lps0_dsm_func_mask;
static guid_t lps0_dsm_guid_microsoft;
static int lps0_dsm_func_mask_microsoft;
static int lps0_dsm_state;
/* Device constraint entry structure */
struct lpi_device_info {
@@ -320,6 +321,44 @@ static void lpi_check_constraints(void)
}
}
static bool acpi_s2idle_vendor_amd(void)
{
return boot_cpu_data.x86_vendor == X86_VENDOR_AMD;
}
static const char *acpi_sleep_dsm_state_to_str(unsigned int state)
{
if (lps0_dsm_func_mask_microsoft || !acpi_s2idle_vendor_amd()) {
switch (state) {
case ACPI_LPS0_SCREEN_OFF:
return "screen off";
case ACPI_LPS0_SCREEN_ON:
return "screen on";
case ACPI_LPS0_ENTRY:
return "lps0 entry";
case ACPI_LPS0_EXIT:
return "lps0 exit";
case ACPI_LPS0_MS_ENTRY:
return "lps0 ms entry";
case ACPI_LPS0_MS_EXIT:
return "lps0 ms exit";
}
} else {
switch (state) {
case ACPI_LPS0_SCREEN_ON_AMD:
return "screen on";
case ACPI_LPS0_SCREEN_OFF_AMD:
return "screen off";
case ACPI_LPS0_ENTRY_AMD:
return "lps0 entry";
case ACPI_LPS0_EXIT_AMD:
return "lps0 exit";
}
}
return "unknown";
}
static void acpi_sleep_run_lps0_dsm(unsigned int func, unsigned int func_mask, guid_t dsm_guid)
{
union acpi_object *out_obj;
@@ -331,14 +370,15 @@ static void acpi_sleep_run_lps0_dsm(unsigned int func, unsigned int func_mask, guid_t dsm_guid)
rev_id, func, NULL);
ACPI_FREE(out_obj);
acpi_handle_debug(lps0_device_handle, "_DSM function %u evaluation %s\n",
func, out_obj ? "successful" : "failed");
lps0_dsm_state = func;
if (pm_debug_messages_on) {
acpi_handle_info(lps0_device_handle,
"%s transitioned to state %s\n",
out_obj ? "Successfully" : "Failed to",
acpi_sleep_dsm_state_to_str(lps0_dsm_state));
}
}
static bool acpi_s2idle_vendor_amd(void)
{
return boot_cpu_data.x86_vendor == X86_VENDOR_AMD;
}
static int validate_dsm(acpi_handle handle, const char *uuid, int rev, guid_t *dsm_guid)
{

drivers/base/power/domain.c

@@ -1632,9 +1632,6 @@ static int genpd_add_device(struct generic_pm_domain *genpd, struct device *dev,
dev_dbg(dev, "%s()\n", __func__);
if (IS_ERR_OR_NULL(genpd) || IS_ERR_OR_NULL(dev))
return -EINVAL;
gpd_data = genpd_alloc_dev_data(dev, gd);
if (IS_ERR(gpd_data))
return PTR_ERR(gpd_data);
@@ -1676,6 +1673,9 @@ int pm_genpd_add_device(struct generic_pm_domain *genpd, struct device *dev)
{
int ret;
if (!genpd || !dev)
return -EINVAL;
mutex_lock(&gpd_list_lock);
ret = genpd_add_device(genpd, dev, dev);
mutex_unlock(&gpd_list_lock);
@@ -2523,6 +2523,9 @@ int of_genpd_add_device(struct of_phandle_args *genpdspec, struct device *dev)
struct generic_pm_domain *genpd;
int ret;
if (!dev)
return -EINVAL;
mutex_lock(&gpd_list_lock);
genpd = genpd_get_from_provider(genpdspec);
@@ -2939,10 +2942,10 @@ static int genpd_parse_state(struct genpd_power_state *genpd_state,
err = of_property_read_u32(state_node, "min-residency-us", &residency);
if (!err)
genpd_state->residency_ns = 1000 * residency;
genpd_state->residency_ns = 1000LL * residency;
genpd_state->power_on_latency_ns = 1000 * exit_latency;
genpd_state->power_off_latency_ns = 1000 * entry_latency;
genpd_state->power_on_latency_ns = 1000LL * exit_latency;
genpd_state->power_off_latency_ns = 1000LL * entry_latency;
genpd_state->fwnode = &state_node->fwnode;
return 0;
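
The 1000LL literals above are the actual fix: of_property_read_u32() fills 32-bit values, and with a plain "1000 *" the multiplication happens in 32-bit arithmetic and wraps before the result is widened into the 64-bit *_ns fields. A minimal standalone sketch of the difference, with a hypothetical residency value (not from the kernel tree):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
	uint32_t residency = 5000000;         /* 5 s in microseconds */
	int64_t wrapped = 1000 * residency;   /* 32-bit multiply, wraps at 2^32 */
	int64_t correct = 1000LL * residency; /* operand widened before multiplying */

	printf("wrapped: %lld ns\n", (long long)wrapped); /* 705032704 */
	printf("correct: %lld ns\n", (long long)correct); /* 5000000000 */
	return 0;
}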

drivers/base/power/wakeup.c

@@ -19,11 +19,6 @@
#include "power.h"
#ifndef CONFIG_SUSPEND
suspend_state_t pm_suspend_target_state;
#define pm_suspend_target_state (PM_SUSPEND_ON)
#endif
#define list_for_each_entry_rcu_locked(pos, head, member) \
list_for_each_entry_rcu(pos, head, member, \
srcu_read_lock_held(&wakeup_srcu))

drivers/cpufreq/Kconfig

@@ -38,7 +38,7 @@ choice
prompt "Default CPUFreq governor"
default CPU_FREQ_DEFAULT_GOV_USERSPACE if ARM_SA1110_CPUFREQ
default CPU_FREQ_DEFAULT_GOV_SCHEDUTIL if ARM64 || ARM
default CPU_FREQ_DEFAULT_GOV_SCHEDUTIL if X86_INTEL_PSTATE && SMP
default CPU_FREQ_DEFAULT_GOV_SCHEDUTIL if (X86_INTEL_PSTATE || X86_AMD_PSTATE) && SMP
default CPU_FREQ_DEFAULT_GOV_PERFORMANCE
help
This option sets which CPUFreq governor shall be loaded at

drivers/cpufreq/Kconfig.x86

@@ -51,6 +51,23 @@ config X86_AMD_PSTATE
If in doubt, say N.
config X86_AMD_PSTATE_DEFAULT_MODE
int "AMD Processor P-State default mode"
depends on X86_AMD_PSTATE
default 3 if X86_AMD_PSTATE
range 1 4
help
Select the default mode the amd-pstate driver will use on
supported hardware.
The value set has the following meanings:
1 -> Disabled
2 -> Passive
3 -> Active (EPP)
4 -> Guided
For details, take a look at:
<file:Documentation/admin-guide/pm/amd-pstate.rst>.
config X86_AMD_PSTATE_UT
tristate "selftest for AMD Processor P-State driver"
depends on X86 && ACPI_PROCESSOR
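
For reference, a standalone sketch (not from the kernel tree) of how the Kconfig value 1..4 lines up with the driver's mode enum once AMD_PSTATE_UNDEFINED takes index 0; the enum and strings mirror the include/linux/amd-pstate.h hunk further down:

#include <stdio.h>

enum amd_pstate_mode {
	AMD_PSTATE_UNDEFINED = 0,
	AMD_PSTATE_DISABLE,
	AMD_PSTATE_PASSIVE,
	AMD_PSTATE_ACTIVE,
	AMD_PSTATE_GUIDED,
};

static const char * const amd_pstate_mode_string[] = {
	[AMD_PSTATE_UNDEFINED] = "undefined",
	[AMD_PSTATE_DISABLE] = "disable",
	[AMD_PSTATE_PASSIVE] = "passive",
	[AMD_PSTATE_ACTIVE] = "active",
	[AMD_PSTATE_GUIDED] = "guided",
};

int main(void)
{
	/* CONFIG_X86_AMD_PSTATE_DEFAULT_MODE ranges over 1..4; the default
	 * of 3 therefore selects "active" (EPP), matching the help text above. */
	for (int mode = AMD_PSTATE_DISABLE; mode <= AMD_PSTATE_GUIDED; mode++)
		printf("%d -> %s\n", mode, amd_pstate_mode_string[mode]);
	return 0;
}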

drivers/cpufreq/amd-pstate.c

@@ -62,7 +62,8 @@
static struct cpufreq_driver *current_pstate_driver;
static struct cpufreq_driver amd_pstate_driver;
static struct cpufreq_driver amd_pstate_epp_driver;
static int cppc_state = AMD_PSTATE_DISABLE;
static int cppc_state = AMD_PSTATE_UNDEFINED;
static bool cppc_enabled;
/*
* AMD Energy Preference Performance (EPP)
@@ -228,7 +229,28 @@ static int amd_pstate_set_energy_pref_index(struct amd_cpudata *cpudata,
static inline int pstate_enable(bool enable)
{
return wrmsrl_safe(MSR_AMD_CPPC_ENABLE, enable);
int ret, cpu;
unsigned long logical_proc_id_mask = 0;
if (enable == cppc_enabled)
return 0;
for_each_present_cpu(cpu) {
unsigned long logical_id = topology_logical_die_id(cpu);
if (test_bit(logical_id, &logical_proc_id_mask))
continue;
set_bit(logical_id, &logical_proc_id_mask);
ret = wrmsrl_safe_on_cpu(cpu, MSR_AMD_CPPC_ENABLE,
enable);
if (ret)
return ret;
}
cppc_enabled = enable;
return 0;
}
static int cppc_enable(bool enable)
@@ -236,6 +258,9 @@ static int cppc_enable(bool enable)
int cpu, ret = 0;
struct cppc_perf_ctrls perf_ctrls;
if (enable == cppc_enabled)
return 0;
for_each_present_cpu(cpu) {
ret = cppc_set_enable(cpu, enable);
if (ret)
@@ -251,6 +276,7 @@ static int cppc_enable(bool enable)
}
}
cppc_enabled = enable;
return ret;
}
@@ -1045,6 +1071,26 @@ static const struct attribute_group amd_pstate_global_attr_group = {
.attrs = pstate_global_attributes,
};
static bool amd_pstate_acpi_pm_profile_server(void)
{
switch (acpi_gbl_FADT.preferred_profile) {
case PM_ENTERPRISE_SERVER:
case PM_SOHO_SERVER:
case PM_PERFORMANCE_SERVER:
return true;
}
return false;
}
static bool amd_pstate_acpi_pm_profile_undefined(void)
{
if (acpi_gbl_FADT.preferred_profile == PM_UNSPECIFIED)
return true;
if (acpi_gbl_FADT.preferred_profile >= NR_PM_PROFILES)
return true;
return false;
}
static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
{
int min_freq, max_freq, nominal_freq, lowest_nonlinear_freq, ret;
@@ -1102,10 +1148,14 @@ static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
policy->max = policy->cpuinfo.max_freq;
/*
* Set the policy to powersave to provide a valid fallback value in case
* Set the policy to provide a valid fallback value in case
* the default cpufreq governor is neither powersave nor performance.
*/
policy->policy = CPUFREQ_POLICY_POWERSAVE;
if (amd_pstate_acpi_pm_profile_server() ||
amd_pstate_acpi_pm_profile_undefined())
policy->policy = CPUFREQ_POLICY_PERFORMANCE;
else
policy->policy = CPUFREQ_POLICY_POWERSAVE;
if (boot_cpu_has(X86_FEATURE_CPPC)) {
ret = rdmsrl_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ, &value);
@@ -1356,10 +1406,29 @@ static struct cpufreq_driver amd_pstate_epp_driver = {
.online = amd_pstate_epp_cpu_online,
.suspend = amd_pstate_epp_suspend,
.resume = amd_pstate_epp_resume,
.name = "amd_pstate_epp",
.name = "amd-pstate-epp",
.attr = amd_pstate_epp_attr,
};
static int __init amd_pstate_set_driver(int mode_idx)
{
if (mode_idx >= AMD_PSTATE_DISABLE && mode_idx < AMD_PSTATE_MAX) {
cppc_state = mode_idx;
if (cppc_state == AMD_PSTATE_DISABLE)
pr_info("driver is explicitly disabled\n");
if (cppc_state == AMD_PSTATE_ACTIVE)
current_pstate_driver = &amd_pstate_epp_driver;
if (cppc_state == AMD_PSTATE_PASSIVE || cppc_state == AMD_PSTATE_GUIDED)
current_pstate_driver = &amd_pstate_driver;
return 0;
}
return -EINVAL;
}
static int __init amd_pstate_init(void)
{
struct device *dev_root;
@@ -1367,15 +1436,6 @@ static int __init amd_pstate_init(void)
if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD)
return -ENODEV;
/*
* by default the pstate driver is disabled to load
* enable the amd_pstate passive mode driver explicitly
* with amd_pstate=passive or other modes in kernel command line
*/
if (cppc_state == AMD_PSTATE_DISABLE) {
pr_info("driver load is disabled, boot with specific mode to enable this\n");
return -ENODEV;
}
if (!acpi_cpc_valid()) {
pr_warn_once("the _CPC object is not present in SBIOS or ACPI disabled\n");
@@ -1386,6 +1446,33 @@
if (cpufreq_get_current_driver())
return -EEXIST;
switch (cppc_state) {
case AMD_PSTATE_UNDEFINED:
/* Disable on the following configs by default:
* 1. Undefined platforms
* 2. Server platforms
* 3. Shared memory designs
*/
if (amd_pstate_acpi_pm_profile_undefined() ||
amd_pstate_acpi_pm_profile_server() ||
!boot_cpu_has(X86_FEATURE_CPPC)) {
pr_info("driver load is disabled, boot with specific mode to enable this\n");
return -ENODEV;
}
ret = amd_pstate_set_driver(CONFIG_X86_AMD_PSTATE_DEFAULT_MODE);
if (ret)
return ret;
break;
case AMD_PSTATE_DISABLE:
return -ENODEV;
case AMD_PSTATE_PASSIVE:
case AMD_PSTATE_ACTIVE:
case AMD_PSTATE_GUIDED:
break;
default:
return -EINVAL;
}
/* capability check */
if (boot_cpu_has(X86_FEATURE_CPPC)) {
pr_debug("AMD CPPC MSR based functionality is supported\n");
@@ -1438,21 +1525,7 @@ static int __init amd_pstate_param(char *str)
size = strlen(str);
mode_idx = get_mode_idx_from_str(str, size);
if (mode_idx >= AMD_PSTATE_DISABLE && mode_idx < AMD_PSTATE_MAX) {
cppc_state = mode_idx;
if (cppc_state == AMD_PSTATE_DISABLE)
pr_info("driver is explicitly disabled\n");
if (cppc_state == AMD_PSTATE_ACTIVE)
current_pstate_driver = &amd_pstate_epp_driver;
if (cppc_state == AMD_PSTATE_PASSIVE || cppc_state == AMD_PSTATE_GUIDED)
current_pstate_driver = &amd_pstate_driver;
return 0;
}
return -EINVAL;
return amd_pstate_set_driver(mode_idx);
}
early_param("amd_pstate", amd_pstate_param);

drivers/cpufreq/cpufreq.c

@@ -2828,7 +2828,8 @@ int cpufreq_register_driver(struct cpufreq_driver *driver_data)
(driver_data->setpolicy && (driver_data->target_index ||
driver_data->target)) ||
(!driver_data->get_intermediate != !driver_data->target_intermediate) ||
(!driver_data->online != !driver_data->offline))
(!driver_data->online != !driver_data->offline) ||
(driver_data->adjust_perf && !driver_data->fast_switch))
return -EINVAL;
pr_debug("trying to register driver %s\n", driver_data->name);

drivers/cpufreq/intel_pstate.c

@@ -824,6 +824,8 @@ static ssize_t store_energy_performance_preference(
err = cpufreq_start_governor(policy);
if (!ret)
ret = err;
} else {
ret = 0;
}
}

drivers/devfreq/exynos-bus.c

@@ -518,6 +518,7 @@ static struct platform_driver exynos_bus_platdrv = {
};
module_platform_driver(exynos_bus_platdrv);
MODULE_SOFTDEP("pre: exynos_ppmu");
MODULE_DESCRIPTION("Generic Exynos Bus frequency driver");
MODULE_AUTHOR("Chanwoo Choi <cw00.choi@samsung.com>");
MODULE_LICENSE("GPL v2");

drivers/devfreq/mtk-cci-devfreq.c

@@ -127,7 +127,7 @@ static int mtk_ccifreq_target(struct device *dev, unsigned long *freq,
u32 flags)
{
struct mtk_ccifreq_drv *drv = dev_get_drvdata(dev);
struct clk *cci_pll = clk_get_parent(drv->cci_clk);
struct clk *cci_pll;
struct dev_pm_opp *opp;
unsigned long opp_rate;
int voltage, pre_voltage, inter_voltage, target_voltage, ret;
@@ -139,6 +139,7 @@ static int mtk_ccifreq_target(struct device *dev, unsigned long *freq,
return 0;
inter_voltage = drv->inter_voltage;
cci_pll = clk_get_parent(drv->cci_clk);
opp_rate = *freq;
opp = devfreq_recommended_opp(dev, &opp_rate, 1);

drivers/idle/intel_idle.c

@@ -199,6 +199,43 @@ static __cpuidle int intel_idle_xstate(struct cpuidle_device *dev,
return __intel_idle(dev, drv, index);
}
static __always_inline int __intel_idle_hlt(struct cpuidle_device *dev,
struct cpuidle_driver *drv, int index)
{
raw_safe_halt();
raw_local_irq_disable();
return index;
}
/**
* intel_idle_hlt - Ask the processor to enter the given idle state using hlt.
* @dev: cpuidle device of the target CPU.
* @drv: cpuidle driver (assumed to point to intel_idle_driver).
* @index: Target idle state index.
*
* Use the HLT instruction to notify the processor that the CPU represented by
* @dev is idle and it can try to enter the idle state corresponding to @index.
*
* Must be called under local_irq_disable().
*/
static __cpuidle int intel_idle_hlt(struct cpuidle_device *dev,
struct cpuidle_driver *drv, int index)
{
return __intel_idle_hlt(dev, drv, index);
}
static __cpuidle int intel_idle_hlt_irq_on(struct cpuidle_device *dev,
struct cpuidle_driver *drv, int index)
{
int ret;
raw_local_irq_enable();
ret = __intel_idle_hlt(dev, drv, index);
raw_local_irq_disable();
return ret;
}
/**
* intel_idle_s2idle - Ask the processor to enter the given idle state.
* @dev: cpuidle device of the target CPU.
@@ -1242,6 +1279,25 @@ static struct cpuidle_state snr_cstates[] __initdata = {
.enter = NULL }
};
static struct cpuidle_state vmguest_cstates[] __initdata = {
{
.name = "C1",
.desc = "HLT",
.flags = MWAIT2flg(0x00) | CPUIDLE_FLAG_IRQ_ENABLE,
.exit_latency = 5,
.target_residency = 10,
.enter = &intel_idle_hlt, },
{
.name = "C1L",
.desc = "Long HLT",
.flags = MWAIT2flg(0x00) | CPUIDLE_FLAG_TLB_FLUSHED,
.exit_latency = 5,
.target_residency = 200,
.enter = &intel_idle_hlt, },
{
.enter = NULL }
};
static const struct idle_cpu idle_cpu_nehalem __initconst = {
.state_table = nehalem_cstates,
.auto_demotion_disable_flags = NHM_C1_AUTO_DEMOTE | NHM_C3_AUTO_DEMOTE,
@@ -1839,6 +1895,66 @@ static bool __init intel_idle_verify_cstate(unsigned int mwait_hint)
return true;
}
static void state_update_enter_method(struct cpuidle_state *state, int cstate)
{
if (state->enter == intel_idle_hlt) {
if (force_irq_on) {
pr_info("forced intel_idle_irq for state %d\n", cstate);
state->enter = intel_idle_hlt_irq_on;
}
return;
}
if (state->enter == intel_idle_hlt_irq_on)
return; /* no update scenarios */
if (state->flags & CPUIDLE_FLAG_INIT_XSTATE) {
/*
* Combining XSTATE with IBRS or IRQ_ENABLE flags
* is not currently supported by this driver.
*/
WARN_ON_ONCE(state->flags & CPUIDLE_FLAG_IBRS);
WARN_ON_ONCE(state->flags & CPUIDLE_FLAG_IRQ_ENABLE);
state->enter = intel_idle_xstate;
return;
}
if (cpu_feature_enabled(X86_FEATURE_KERNEL_IBRS) &&
state->flags & CPUIDLE_FLAG_IBRS) {
/*
* IBRS mitigation requires that C-states are entered
* with interrupts disabled.
*/
WARN_ON_ONCE(state->flags & CPUIDLE_FLAG_IRQ_ENABLE);
state->enter = intel_idle_ibrs;
return;
}
if (state->flags & CPUIDLE_FLAG_IRQ_ENABLE) {
state->enter = intel_idle_irq;
return;
}
if (force_irq_on) {
pr_info("forced intel_idle_irq for state %d\n", cstate);
state->enter = intel_idle_irq;
}
}
/*
* For mwait based states, we want to verify the cpuid data to see if the state
* is actually supported by this specific CPU.
* For non-mwait based states, this check should be skipped.
*/
static bool should_verify_mwait(struct cpuidle_state *state)
{
if (state->enter == intel_idle_hlt)
return false;
if (state->enter == intel_idle_hlt_irq_on)
return false;
return true;
}
static void __init intel_idle_init_cstates_icpu(struct cpuidle_driver *drv)
{
int cstate;
@@ -1887,35 +2003,15 @@ static void __init intel_idle_init_cstates_icpu(struct cpuidle_driver *drv)
}
mwait_hint = flg2MWAIT(cpuidle_state_table[cstate].flags);
if (!intel_idle_verify_cstate(mwait_hint))
if (should_verify_mwait(&cpuidle_state_table[cstate]) && !intel_idle_verify_cstate(mwait_hint))
continue;
/* Structure copy. */
drv->states[drv->state_count] = cpuidle_state_table[cstate];
state = &drv->states[drv->state_count];
if (state->flags & CPUIDLE_FLAG_INIT_XSTATE) {
/*
* Combining XSTATE with IBRS or IRQ_ENABLE flags
* is not currently supported by this driver.
*/
WARN_ON_ONCE(state->flags & CPUIDLE_FLAG_IBRS);
WARN_ON_ONCE(state->flags & CPUIDLE_FLAG_IRQ_ENABLE);
state->enter = intel_idle_xstate;
} else if (cpu_feature_enabled(X86_FEATURE_KERNEL_IBRS) &&
state->flags & CPUIDLE_FLAG_IBRS) {
/*
* IBRS mitigation requires that C-states are entered
* with interrupts disabled.
*/
WARN_ON_ONCE(state->flags & CPUIDLE_FLAG_IRQ_ENABLE);
state->enter = intel_idle_ibrs;
} else if (state->flags & CPUIDLE_FLAG_IRQ_ENABLE) {
state->enter = intel_idle_irq;
} else if (force_irq_on) {
pr_info("forced intel_idle_irq for state %d\n", cstate);
state->enter = intel_idle_irq;
}
state_update_enter_method(state, cstate);
if ((disabled_states_mask & BIT(drv->state_count)) ||
((icpu->use_acpi || force_use_acpi) &&
@@ -2041,6 +2137,93 @@ static void __init intel_idle_cpuidle_devices_uninit(void)
cpuidle_unregister_device(per_cpu_ptr(intel_idle_cpuidle_devices, i));
}
/*
* Match up the latency and break even point of the bare metal (cpu based)
* states with the deepest VM available state.
*
* We only want to do this for the deepest state, that is, the ones
* that have the TLB_FLUSHED flag set.
*
* All our short idle states are dominated by vmexit/vmenter latencies,
* not the underlying hardware latencies so we keep our values for these.
*/
static void matchup_vm_state_with_baremetal(void)
{
int cstate;
for (cstate = 0; cstate < CPUIDLE_STATE_MAX; ++cstate) {
int matching_cstate;
if (intel_idle_max_cstate_reached(cstate))
break;
if (!cpuidle_state_table[cstate].enter)
break;
if (!(cpuidle_state_table[cstate].flags & CPUIDLE_FLAG_TLB_FLUSHED))
continue;
for (matching_cstate = 0; matching_cstate < CPUIDLE_STATE_MAX; ++matching_cstate) {
if (!icpu->state_table[matching_cstate].enter)
break;
if (icpu->state_table[matching_cstate].exit_latency > cpuidle_state_table[cstate].exit_latency) {
cpuidle_state_table[cstate].exit_latency = icpu->state_table[matching_cstate].exit_latency;
cpuidle_state_table[cstate].target_residency = icpu->state_table[matching_cstate].target_residency;
}
}
}
}
static int __init intel_idle_vminit(const struct x86_cpu_id *id)
{
int retval;
cpuidle_state_table = vmguest_cstates;
icpu = (const struct idle_cpu *)id->driver_data;
pr_debug("v" INTEL_IDLE_VERSION " model 0x%X\n",
boot_cpu_data.x86_model);
intel_idle_cpuidle_devices = alloc_percpu(struct cpuidle_device);
if (!intel_idle_cpuidle_devices)
return -ENOMEM;
/*
* We don't know exactly what the host will do when we go idle, but as a worst estimate
* we can assume that the exit latency of the deepest host state will be hit for our
* deep (long duration) guest idle state.
* The same logic applies to the break even point for the long duration guest idle state.
* So let's copy these two properties from the table we found for the host CPU type.
*/
matchup_vm_state_with_baremetal();
intel_idle_cpuidle_driver_init(&intel_idle_driver);
retval = cpuidle_register_driver(&intel_idle_driver);
if (retval) {
struct cpuidle_driver *drv = cpuidle_get_driver();
printk(KERN_DEBUG pr_fmt("intel_idle yielding to %s\n"),
drv ? drv->name : "none");
goto init_driver_fail;
}
retval = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "idle/intel:online",
intel_idle_cpu_online, NULL);
if (retval < 0)
goto hp_setup_fail;
return 0;
hp_setup_fail:
intel_idle_cpuidle_devices_uninit();
cpuidle_unregister_driver(&intel_idle_driver);
init_driver_fail:
free_percpu(intel_idle_cpuidle_devices);
return retval;
}
static int __init intel_idle_init(void)
{
const struct x86_cpu_id *id;
@@ -2059,6 +2242,8 @@ static int __init intel_idle_init(void)
id = x86_match_cpu(intel_idle_ids);
if (id) {
if (!boot_cpu_has(X86_FEATURE_MWAIT)) {
if (boot_cpu_has(X86_FEATURE_HYPERVISOR))
return intel_idle_vminit(id);
pr_debug("Please enable MWAIT in BIOS SETUP\n");
return -ENODEV;
}

drivers/pinctrl/pinctrl-amd.c

@@ -30,6 +30,7 @@
#include <linux/pinctrl/pinconf.h>
#include <linux/pinctrl/pinconf-generic.h>
#include <linux/pinctrl/pinmux.h>
#include <linux/suspend.h>
#include "core.h"
#include "pinctrl-utils.h"
@@ -636,9 +637,8 @@ static bool do_amd_gpio_irq_handler(int irq, void *dev_id)
regval = readl(regs + i);
if (regval & PIN_IRQ_PENDING)
dev_dbg(&gpio_dev->pdev->dev,
"GPIO %d is active: 0x%x",
irqnr + i, regval);
pm_pr_dbg("GPIO %d is active: 0x%x",
irqnr + i, regval);
/* caused wake on resume context for shared IRQ */
if (irq < 0 && (regval & BIT(WAKE_STS_OFF)))

drivers/platform/x86/amd/pmc.c

@@ -543,7 +543,7 @@ static int amd_pmc_idlemask_read(struct amd_pmc_dev *pdev, struct device *dev,
}
if (dev)
dev_dbg(pdev->dev, "SMU idlemask s0i3: 0x%x\n", val);
pm_pr_dbg("SMU idlemask s0i3: 0x%x\n", val);
if (s)
seq_printf(s, "SMU idlemask : 0x%x\n", val);
@@ -769,7 +769,7 @@ static int amd_pmc_verify_czn_rtc(struct amd_pmc_dev *pdev, u32 *arg)
*arg |= (duration << 16);
rc = rtc_alarm_irq_enable(rtc_device, 0);
dev_dbg(pdev->dev, "wakeup timer programmed for %lld seconds\n", duration);
pm_pr_dbg("wakeup timer programmed for %lld seconds\n", duration);
return rc;
}

drivers/powercap/Kconfig

@@ -18,10 +18,12 @@ if POWERCAP
# Client driver configurations go here.
config INTEL_RAPL_CORE
tristate
depends on PCI
select IOSF_MBI
config INTEL_RAPL
tristate "Intel RAPL Support via MSR Interface"
depends on X86 && IOSF_MBI
depends on X86 && PCI
select INTEL_RAPL_CORE
help
This enables support for the Intel Running Average Power Limit (RAPL)
@@ -33,6 +35,20 @@ config INTEL_RAPL
controller, CPU core (Power Plane 0), graphics uncore (Power Plane
1), etc.
config INTEL_RAPL_TPMI
tristate "Intel RAPL Support via TPMI Interface"
depends on X86
depends on INTEL_TPMI
select INTEL_RAPL_CORE
help
This enables support for the Intel Running Average Power Limit (RAPL)
technology via TPMI interface, which allows power limits to be enforced
and monitored.
In RAPL, the platform level settings are divided into domains for
fine grained control. These domains include processor package, DRAM
controller, platform, etc.
config IDLE_INJECT
bool "Idle injection framework"
depends on CPU_IDLE

drivers/powercap/Makefile

@@ -5,5 +5,6 @@ obj-$(CONFIG_DTPM_DEVFREQ) += dtpm_devfreq.o
obj-$(CONFIG_POWERCAP) += powercap_sys.o
obj-$(CONFIG_INTEL_RAPL_CORE) += intel_rapl_common.o
obj-$(CONFIG_INTEL_RAPL) += intel_rapl_msr.o
obj-$(CONFIG_INTEL_RAPL_TPMI) += intel_rapl_tpmi.o
obj-$(CONFIG_IDLE_INJECT) += idle_inject.o
obj-$(CONFIG_ARM_SCMI_POWERCAP) += arm_scmi_powercap.o

(File diff suppressed because it is too large.)

drivers/powercap/intel_rapl_msr.c

@@ -22,7 +22,6 @@
#include <linux/processor.h>
#include <linux/platform_device.h>
#include <asm/iosf_mbi.h>
#include <asm/cpu_device_id.h>
#include <asm/intel-family.h>
@@ -34,6 +33,7 @@
static struct rapl_if_priv *rapl_msr_priv;
static struct rapl_if_priv rapl_msr_priv_intel = {
.type = RAPL_IF_MSR,
.reg_unit = MSR_RAPL_POWER_UNIT,
.regs[RAPL_DOMAIN_PACKAGE] = {
MSR_PKG_POWER_LIMIT, MSR_PKG_ENERGY_STATUS, MSR_PKG_PERF_STATUS, 0, MSR_PKG_POWER_INFO },
@@ -45,11 +45,12 @@ static struct rapl_if_priv rapl_msr_priv_intel = {
MSR_DRAM_POWER_LIMIT, MSR_DRAM_ENERGY_STATUS, MSR_DRAM_PERF_STATUS, 0, MSR_DRAM_POWER_INFO },
.regs[RAPL_DOMAIN_PLATFORM] = {
MSR_PLATFORM_POWER_LIMIT, MSR_PLATFORM_ENERGY_STATUS, 0, 0, 0},
.limits[RAPL_DOMAIN_PACKAGE] = 2,
.limits[RAPL_DOMAIN_PLATFORM] = 2,
.limits[RAPL_DOMAIN_PACKAGE] = BIT(POWER_LIMIT2),
.limits[RAPL_DOMAIN_PLATFORM] = BIT(POWER_LIMIT2),
};
static struct rapl_if_priv rapl_msr_priv_amd = {
.type = RAPL_IF_MSR,
.reg_unit = MSR_AMD_RAPL_POWER_UNIT,
.regs[RAPL_DOMAIN_PACKAGE] = {
0, MSR_AMD_PKG_ENERGY_STATUS, 0, 0, 0 },
@@ -68,9 +69,9 @@ static int rapl_cpu_online(unsigned int cpu)
{
struct rapl_package *rp;
rp = rapl_find_package_domain(cpu, rapl_msr_priv);
rp = rapl_find_package_domain(cpu, rapl_msr_priv, true);
if (!rp) {
rp = rapl_add_package(cpu, rapl_msr_priv);
rp = rapl_add_package(cpu, rapl_msr_priv, true);
if (IS_ERR(rp))
return PTR_ERR(rp);
}
@@ -83,7 +84,7 @@ static int rapl_cpu_down_prep(unsigned int cpu)
struct rapl_package *rp;
int lead_cpu;
rp = rapl_find_package_domain(cpu, rapl_msr_priv);
rp = rapl_find_package_domain(cpu, rapl_msr_priv, true);
if (!rp)
return 0;
@@ -137,14 +138,14 @@ static int rapl_msr_write_raw(int cpu, struct reg_action *ra)
/* List of verified CPUs. */
static const struct x86_cpu_id pl4_support_ids[] = {
{ X86_VENDOR_INTEL, 6, INTEL_FAM6_TIGERLAKE_L, X86_FEATURE_ANY },
{ X86_VENDOR_INTEL, 6, INTEL_FAM6_ALDERLAKE, X86_FEATURE_ANY },
{ X86_VENDOR_INTEL, 6, INTEL_FAM6_ALDERLAKE_L, X86_FEATURE_ANY },
{ X86_VENDOR_INTEL, 6, INTEL_FAM6_ALDERLAKE_N, X86_FEATURE_ANY },
{ X86_VENDOR_INTEL, 6, INTEL_FAM6_RAPTORLAKE, X86_FEATURE_ANY },
{ X86_VENDOR_INTEL, 6, INTEL_FAM6_RAPTORLAKE_P, X86_FEATURE_ANY },
{ X86_VENDOR_INTEL, 6, INTEL_FAM6_METEORLAKE, X86_FEATURE_ANY },
{ X86_VENDOR_INTEL, 6, INTEL_FAM6_METEORLAKE_L, X86_FEATURE_ANY },
X86_MATCH_INTEL_FAM6_MODEL(TIGERLAKE_L, NULL),
X86_MATCH_INTEL_FAM6_MODEL(ALDERLAKE, NULL),
X86_MATCH_INTEL_FAM6_MODEL(ALDERLAKE_L, NULL),
X86_MATCH_INTEL_FAM6_MODEL(ALDERLAKE_N, NULL),
X86_MATCH_INTEL_FAM6_MODEL(RAPTORLAKE, NULL),
X86_MATCH_INTEL_FAM6_MODEL(RAPTORLAKE_P, NULL),
X86_MATCH_INTEL_FAM6_MODEL(METEORLAKE, NULL),
X86_MATCH_INTEL_FAM6_MODEL(METEORLAKE_L, NULL),
{}
};
@@ -169,7 +170,7 @@ static int rapl_msr_probe(struct platform_device *pdev)
rapl_msr_priv->write_raw = rapl_msr_write_raw;
if (id) {
rapl_msr_priv->limits[RAPL_DOMAIN_PACKAGE] = 3;
rapl_msr_priv->limits[RAPL_DOMAIN_PACKAGE] |= BIT(POWER_LIMIT4);
rapl_msr_priv->regs[RAPL_DOMAIN_PACKAGE][RAPL_DOMAIN_REG_PL4] =
MSR_VR_CURRENT_CONFIG;
pr_info("PL4 support detected.\n");

drivers/powercap/intel_rapl_tpmi.c

@@ -0,0 +1,325 @@
// SPDX-License-Identifier: GPL-2.0-only
/*
* intel_rapl_tpmi: Intel RAPL driver via TPMI interface
*
* Copyright (c) 2023, Intel Corporation.
* All Rights Reserved.
*
*/
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
#include <linux/auxiliary_bus.h>
#include <linux/io.h>
#include <linux/intel_tpmi.h>
#include <linux/intel_rapl.h>
#include <linux/module.h>
#include <linux/slab.h>
#define TPMI_RAPL_VERSION 1
/* 1 header + 10 registers + 5 reserved. 8 bytes for each. */
#define TPMI_RAPL_DOMAIN_SIZE 128
enum tpmi_rapl_domain_type {
TPMI_RAPL_DOMAIN_INVALID,
TPMI_RAPL_DOMAIN_SYSTEM,
TPMI_RAPL_DOMAIN_PACKAGE,
TPMI_RAPL_DOMAIN_RESERVED,
TPMI_RAPL_DOMAIN_MEMORY,
TPMI_RAPL_DOMAIN_MAX,
};
enum tpmi_rapl_register {
TPMI_RAPL_REG_HEADER,
TPMI_RAPL_REG_UNIT,
TPMI_RAPL_REG_PL1,
TPMI_RAPL_REG_PL2,
TPMI_RAPL_REG_PL3,
TPMI_RAPL_REG_PL4,
TPMI_RAPL_REG_RESERVED,
TPMI_RAPL_REG_ENERGY_STATUS,
TPMI_RAPL_REG_PERF_STATUS,
TPMI_RAPL_REG_POWER_INFO,
TPMI_RAPL_REG_INTERRUPT,
TPMI_RAPL_REG_MAX = 15,
};
struct tpmi_rapl_package {
struct rapl_if_priv priv;
struct intel_tpmi_plat_info *tpmi_info;
struct rapl_package *rp;
void __iomem *base;
struct list_head node;
};
static LIST_HEAD(tpmi_rapl_packages);
static DEFINE_MUTEX(tpmi_rapl_lock);
static struct powercap_control_type *tpmi_control_type;
static int tpmi_rapl_read_raw(int id, struct reg_action *ra)
{
if (!ra->reg)
return -EINVAL;
ra->value = readq((void __iomem *)ra->reg);
ra->value &= ra->mask;
return 0;
}
static int tpmi_rapl_write_raw(int id, struct reg_action *ra)
{
u64 val;
if (!ra->reg)
return -EINVAL;
val = readq((void __iomem *)ra->reg);
val &= ~ra->mask;
val |= ra->value;
writeq(val, (void __iomem *)ra->reg);
return 0;
}
static struct tpmi_rapl_package *trp_alloc(int pkg_id)
{
struct tpmi_rapl_package *trp;
int ret;
mutex_lock(&tpmi_rapl_lock);
if (list_empty(&tpmi_rapl_packages)) {
tpmi_control_type = powercap_register_control_type(NULL, "intel-rapl", NULL);
if (IS_ERR(tpmi_control_type)) {
ret = PTR_ERR(tpmi_control_type);
goto err_unlock;
}
}
trp = kzalloc(sizeof(*trp), GFP_KERNEL);
if (!trp) {
ret = -ENOMEM;
goto err_del_powercap;
}
list_add(&trp->node, &tpmi_rapl_packages);
mutex_unlock(&tpmi_rapl_lock);
return trp;
err_del_powercap:
if (list_empty(&tpmi_rapl_packages))
powercap_unregister_control_type(tpmi_control_type);
err_unlock:
mutex_unlock(&tpmi_rapl_lock);
return ERR_PTR(ret);
}
static void trp_release(struct tpmi_rapl_package *trp)
{
mutex_lock(&tpmi_rapl_lock);
list_del(&trp->node);
if (list_empty(&tpmi_rapl_packages))
powercap_unregister_control_type(tpmi_control_type);
kfree(trp);
mutex_unlock(&tpmi_rapl_lock);
}
static int parse_one_domain(struct tpmi_rapl_package *trp, u32 offset)
{
u8 tpmi_domain_version;
enum rapl_domain_type domain_type;
enum tpmi_rapl_domain_type tpmi_domain_type;
enum tpmi_rapl_register reg_index;
enum rapl_domain_reg_id reg_id;
int tpmi_domain_size, tpmi_domain_flags;
u64 *tpmi_rapl_regs = trp->base + offset;
u64 tpmi_domain_header = readq((void __iomem *)tpmi_rapl_regs);
/* Domain Parent bits are ignored for now */
tpmi_domain_version = tpmi_domain_header & 0xff;
tpmi_domain_type = tpmi_domain_header >> 8 & 0xff;
tpmi_domain_size = tpmi_domain_header >> 16 & 0xff;
tpmi_domain_flags = tpmi_domain_header >> 32 & 0xffff;
if (tpmi_domain_version != TPMI_RAPL_VERSION) {
pr_warn(FW_BUG "Unsupported version:%d\n", tpmi_domain_version);
return -ENODEV;
}
/* Domain size: in unit of 128 Bytes */
if (tpmi_domain_size != 1) {
pr_warn(FW_BUG "Invalid Domain size %d\n", tpmi_domain_size);
return -EINVAL;
}
/* Unit register and Energy Status register are mandatory for each domain */
if (!(tpmi_domain_flags & BIT(TPMI_RAPL_REG_UNIT)) ||
!(tpmi_domain_flags & BIT(TPMI_RAPL_REG_ENERGY_STATUS))) {
pr_warn(FW_BUG "Invalid Domain flag 0x%x\n", tpmi_domain_flags);
return -EINVAL;
}
switch (tpmi_domain_type) {
case TPMI_RAPL_DOMAIN_PACKAGE:
domain_type = RAPL_DOMAIN_PACKAGE;
break;
case TPMI_RAPL_DOMAIN_SYSTEM:
domain_type = RAPL_DOMAIN_PLATFORM;
break;
case TPMI_RAPL_DOMAIN_MEMORY:
domain_type = RAPL_DOMAIN_DRAM;
break;
default:
pr_warn(FW_BUG "Unsupported Domain type %d\n", tpmi_domain_type);
return -EINVAL;
}
if (trp->priv.regs[domain_type][RAPL_DOMAIN_REG_UNIT]) {
pr_warn(FW_BUG "Duplicate Domain type %d\n", tpmi_domain_type);
return -EINVAL;
}
reg_index = TPMI_RAPL_REG_HEADER;
while (++reg_index != TPMI_RAPL_REG_MAX) {
if (!(tpmi_domain_flags & BIT(reg_index)))
continue;
switch (reg_index) {
case TPMI_RAPL_REG_UNIT:
reg_id = RAPL_DOMAIN_REG_UNIT;
break;
case TPMI_RAPL_REG_PL1:
reg_id = RAPL_DOMAIN_REG_LIMIT;
trp->priv.limits[domain_type] |= BIT(POWER_LIMIT1);
break;
case TPMI_RAPL_REG_PL2:
reg_id = RAPL_DOMAIN_REG_PL2;
trp->priv.limits[domain_type] |= BIT(POWER_LIMIT2);
break;
case TPMI_RAPL_REG_PL4:
reg_id = RAPL_DOMAIN_REG_PL4;
trp->priv.limits[domain_type] |= BIT(POWER_LIMIT4);
break;
case TPMI_RAPL_REG_ENERGY_STATUS:
reg_id = RAPL_DOMAIN_REG_STATUS;
break;
case TPMI_RAPL_REG_PERF_STATUS:
reg_id = RAPL_DOMAIN_REG_PERF;
break;
case TPMI_RAPL_REG_POWER_INFO:
reg_id = RAPL_DOMAIN_REG_INFO;
break;
default:
continue;
}
trp->priv.regs[domain_type][reg_id] = (u64)&tpmi_rapl_regs[reg_index];
}
return 0;
}
static int intel_rapl_tpmi_probe(struct auxiliary_device *auxdev,
const struct auxiliary_device_id *id)
{
struct tpmi_rapl_package *trp;
struct intel_tpmi_plat_info *info;
struct resource *res;
u32 offset;
int ret;
info = tpmi_get_platform_data(auxdev);
if (!info)
return -ENODEV;
trp = trp_alloc(info->package_id);
if (IS_ERR(trp))
return PTR_ERR(trp);
if (tpmi_get_resource_count(auxdev) > 1) {
dev_err(&auxdev->dev, "does not support multiple resources\n");
ret = -EINVAL;
goto err;
}
res = tpmi_get_resource_at_index(auxdev, 0);
if (!res) {
dev_err(&auxdev->dev, "can't fetch device resource info\n");
ret = -EIO;
goto err;
}
trp->base = devm_ioremap_resource(&auxdev->dev, res);
if (IS_ERR(trp->base)) {
ret = PTR_ERR(trp->base);
goto err;
}
for (offset = 0; offset < resource_size(res); offset += TPMI_RAPL_DOMAIN_SIZE) {
ret = parse_one_domain(trp, offset);
if (ret)
goto err;
}
trp->tpmi_info = info;
trp->priv.type = RAPL_IF_TPMI;
trp->priv.read_raw = tpmi_rapl_read_raw;
trp->priv.write_raw = tpmi_rapl_write_raw;
trp->priv.control_type = tpmi_control_type;
/* RAPL TPMI I/F is per physical package */
trp->rp = rapl_find_package_domain(info->package_id, &trp->priv, false);
if (trp->rp) {
dev_err(&auxdev->dev, "Domain for Package%d already exists\n", info->package_id);
ret = -EEXIST;
goto err;
}
trp->rp = rapl_add_package(info->package_id, &trp->priv, false);
if (IS_ERR(trp->rp)) {
dev_err(&auxdev->dev, "Failed to add RAPL Domain for Package%d, %ld\n",
info->package_id, PTR_ERR(trp->rp));
ret = PTR_ERR(trp->rp);
goto err;
}
auxiliary_set_drvdata(auxdev, trp);
return 0;
err:
trp_release(trp);
return ret;
}
static void intel_rapl_tpmi_remove(struct auxiliary_device *auxdev)
{
struct tpmi_rapl_package *trp = auxiliary_get_drvdata(auxdev);
rapl_remove_package(trp->rp);
trp_release(trp);
}
static const struct auxiliary_device_id intel_rapl_tpmi_ids[] = {
{.name = "intel_vsec.tpmi-rapl" },
{ }
};
MODULE_DEVICE_TABLE(auxiliary, intel_rapl_tpmi_ids);
static struct auxiliary_driver intel_rapl_tpmi_driver = {
.probe = intel_rapl_tpmi_probe,
.remove = intel_rapl_tpmi_remove,
.id_table = intel_rapl_tpmi_ids,
};
module_auxiliary_driver(intel_rapl_tpmi_driver);
MODULE_IMPORT_NS(INTEL_TPMI);
MODULE_DESCRIPTION("Intel RAPL TPMI Driver");
MODULE_LICENSE("GPL");

drivers/thermal/intel/int340x_thermal/processor_thermal_rapl.c

@@ -15,8 +15,8 @@ static const struct rapl_mmio_regs rapl_mmio_default = {
.reg_unit = 0x5938,
.regs[RAPL_DOMAIN_PACKAGE] = { 0x59a0, 0x593c, 0x58f0, 0, 0x5930},
.regs[RAPL_DOMAIN_DRAM] = { 0x58e0, 0x58e8, 0x58ec, 0, 0},
.limits[RAPL_DOMAIN_PACKAGE] = 2,
.limits[RAPL_DOMAIN_DRAM] = 2,
.limits[RAPL_DOMAIN_PACKAGE] = BIT(POWER_LIMIT2),
.limits[RAPL_DOMAIN_DRAM] = BIT(POWER_LIMIT2),
};
static int rapl_mmio_cpu_online(unsigned int cpu)
@@ -27,9 +27,9 @@ static int rapl_mmio_cpu_online(unsigned int cpu)
if (topology_physical_package_id(cpu))
return 0;
rp = rapl_find_package_domain(cpu, &rapl_mmio_priv);
rp = rapl_find_package_domain(cpu, &rapl_mmio_priv, true);
if (!rp) {
rp = rapl_add_package(cpu, &rapl_mmio_priv);
rp = rapl_add_package(cpu, &rapl_mmio_priv, true);
if (IS_ERR(rp))
return PTR_ERR(rp);
}
@@ -42,7 +42,7 @@ static int rapl_mmio_cpu_down_prep(unsigned int cpu)
struct rapl_package *rp;
int lead_cpu;
rp = rapl_find_package_domain(cpu, &rapl_mmio_priv);
rp = rapl_find_package_domain(cpu, &rapl_mmio_priv, true);
if (!rp)
return 0;
@@ -97,6 +97,7 @@ int proc_thermal_rapl_add(struct pci_dev *pdev, struct proc_thermal_device *proc_priv)
rapl_regs->regs[domain][reg];
rapl_mmio_priv.limits[domain] = rapl_regs->limits[domain];
}
rapl_mmio_priv.type = RAPL_IF_MMIO;
rapl_mmio_priv.reg_unit = (u64)proc_priv->mmio_base + rapl_regs->reg_unit;
rapl_mmio_priv.read_raw = rapl_mmio_read_raw;

include/acpi/actbl.h

@@ -307,7 +307,8 @@ enum acpi_preferred_pm_profiles {
PM_SOHO_SERVER = 5,
PM_APPLIANCE_PC = 6,
PM_PERFORMANCE_SERVER = 7,
PM_TABLET = 8
PM_TABLET = 8,
NR_PM_PROFILES = 9
};
/* Values for sleep_status and sleep_control registers (V5+ FADT) */

include/linux/amd-pstate.h

@@ -94,7 +94,8 @@ struct amd_cpudata {
* enum amd_pstate_mode - driver working mode of amd pstate
*/
enum amd_pstate_mode {
AMD_PSTATE_DISABLE = 0,
AMD_PSTATE_UNDEFINED = 0,
AMD_PSTATE_DISABLE,
AMD_PSTATE_PASSIVE,
AMD_PSTATE_ACTIVE,
AMD_PSTATE_GUIDED,
@@ -102,6 +103,7 @@ enum amd_pstate_mode {
};
static const char * const amd_pstate_mode_string[] = {
[AMD_PSTATE_UNDEFINED] = "undefined",
[AMD_PSTATE_DISABLE] = "disable",
[AMD_PSTATE_PASSIVE] = "passive",
[AMD_PSTATE_ACTIVE] = "active",

include/linux/cpufreq.h

@@ -340,7 +340,10 @@ struct cpufreq_driver {
/*
* ->fast_switch() replacement for drivers that use an internal
* representation of performance levels and can pass hints other than
* the target performance level to the hardware.
* the target performance level to the hardware. This can only be set
* if ->fast_switch is set too, because in those cases (under specific
* conditions) scale invariance can be disabled, which causes the
* schedutil governor to fall back to the latter.
*/
void (*adjust_perf)(unsigned int cpu,
unsigned long min_perf,

include/linux/devfreq.h

@@ -108,7 +108,6 @@ struct devfreq_dev_profile {
unsigned long initial_freq;
unsigned int polling_ms;
enum devfreq_timer timer;
bool is_cooling_device;
int (*target)(struct device *dev, unsigned long *freq, u32 flags);
int (*get_dev_status)(struct device *dev,
@@ -118,6 +117,8 @@
unsigned long *freq_table;
unsigned int max_state;
bool is_cooling_device;
};
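
The reorder pays off because of alignment: sandwiched between the enum and the first function pointer, is_cooling_device costs 1 byte plus 7 bytes of padding on LP64; moved to the tail it fits into padding that already existed. A simplified standalone sketch (member set reduced from the real struct):

#include <stdbool.h>
#include <stdio.h>

struct before {
	unsigned long initial_freq;
	unsigned int polling_ms;
	int timer;
	bool is_cooling_device; /* + 7 bytes of padding before the pointer */
	int (*target)(void);
	unsigned long *freq_table;
	unsigned int max_state; /* + 4 bytes of tail padding */
};

struct after {
	unsigned long initial_freq;
	unsigned int polling_ms;
	int timer;
	int (*target)(void);
	unsigned long *freq_table;
	unsigned int max_state;
	bool is_cooling_device; /* absorbed by the tail padding */
};

int main(void)
{
	/* typically 48 vs 40 bytes on LP64 */
	printf("before: %zu, after: %zu\n",
	       sizeof(struct before), sizeof(struct after));
	return 0;
}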
/**

include/linux/intel_rapl.h

@@ -14,6 +14,12 @@
#include <linux/powercap.h>
#include <linux/cpuhotplug.h>
enum rapl_if_type {
RAPL_IF_MSR, /* RAPL I/F using MSR registers */
RAPL_IF_MMIO, /* RAPL I/F using MMIO registers */
RAPL_IF_TPMI, /* RAPL I/F using TPMI registers */
};
enum rapl_domain_type {
RAPL_DOMAIN_PACKAGE, /* entire package/socket */
RAPL_DOMAIN_PP0, /* core power plane */
@@ -30,17 +36,23 @@ enum rapl_domain_reg_id {
RAPL_DOMAIN_REG_POLICY,
RAPL_DOMAIN_REG_INFO,
RAPL_DOMAIN_REG_PL4,
RAPL_DOMAIN_REG_UNIT,
RAPL_DOMAIN_REG_PL2,
RAPL_DOMAIN_REG_MAX,
};
struct rapl_domain;
enum rapl_primitives {
ENERGY_COUNTER,
POWER_LIMIT1,
POWER_LIMIT2,
POWER_LIMIT4,
ENERGY_COUNTER,
FW_LOCK,
FW_HIGH_LOCK,
PL1_LOCK,
PL2_LOCK,
PL4_LOCK,
PL1_ENABLE, /* power limit 1, aka long term */
PL1_CLAMP, /* allow frequency to go below OS request */
@@ -74,12 +86,13 @@ struct rapl_domain_data {
unsigned long timestamp;
};
#define NR_POWER_LIMITS (3)
#define NR_POWER_LIMITS (POWER_LIMIT4 + 1)
struct rapl_power_limit {
struct powercap_zone_constraint *constraint;
int prim_id; /* primitive ID used to enable */
struct rapl_domain *domain;
const char *name;
bool locked;
u64 last_power_limit;
};
@@ -96,7 +109,9 @@ struct rapl_domain {
struct rapl_power_limit rpl[NR_POWER_LIMITS];
u64 attr_map; /* track capabilities */
unsigned int state;
unsigned int domain_energy_unit;
unsigned int power_unit;
unsigned int energy_unit;
unsigned int time_unit;
struct rapl_package *rp;
};
@@ -121,16 +136,20 @@ struct reg_action {
* registers.
* @write_raw: Callback for writing RAPL interface specific
* registers.
* @defaults: internal pointer to interface default settings
* @rpi: internal pointer to interface primitive info
*/
struct rapl_if_priv {
enum rapl_if_type type;
struct powercap_control_type *control_type;
struct rapl_domain *platform_rapl_domain;
enum cpuhp_state pcap_rapl_online;
u64 reg_unit;
u64 regs[RAPL_DOMAIN_MAX][RAPL_DOMAIN_REG_MAX];
int limits[RAPL_DOMAIN_MAX];
int (*read_raw)(int cpu, struct reg_action *ra);
int (*write_raw)(int cpu, struct reg_action *ra);
int (*read_raw)(int id, struct reg_action *ra);
int (*write_raw)(int id, struct reg_action *ra);
void *defaults;
void *rpi;
};
/* maximum rapl package domain name: package-%d-die-%d */
@@ -140,9 +159,6 @@ struct rapl_package {
unsigned int id; /* logical die id, equals physical 1-die systems */
unsigned int nr_domains;
unsigned long domain_map; /* bit map of active domains */
unsigned int power_unit;
unsigned int energy_unit;
unsigned int time_unit;
struct rapl_domain *domains; /* array of domains, sized at runtime */
struct powercap_zone *power_zone; /* keep track of parent zone */
unsigned long power_limit_irq; /* keep track of package power limit
@@ -156,8 +172,8 @@ struct rapl_package {
struct rapl_if_priv *priv;
};
struct rapl_package *rapl_find_package_domain(int cpu, struct rapl_if_priv *priv);
struct rapl_package *rapl_add_package(int cpu, struct rapl_if_priv *priv);
struct rapl_package *rapl_find_package_domain(int id, struct rapl_if_priv *priv, bool id_is_cpu);
struct rapl_package *rapl_add_package(int id, struct rapl_if_priv *priv, bool id_is_cpu);
void rapl_remove_package(struct rapl_package *rp);
#endif /* __INTEL_RAPL_H__ */

include/linux/suspend.h

@@ -202,6 +202,7 @@ struct platform_s2idle_ops {
};
#ifdef CONFIG_SUSPEND
extern suspend_state_t pm_suspend_target_state;
extern suspend_state_t mem_sleep_current;
extern suspend_state_t mem_sleep_default;
@@ -337,6 +338,8 @@ extern bool sync_on_suspend_enabled;
#else /* !CONFIG_SUSPEND */
#define suspend_valid_only_mem NULL
#define pm_suspend_target_state (PM_SUSPEND_ON)
static inline void pm_suspend_clear_flags(void) {}
static inline void pm_set_suspend_via_firmware(void) {}
static inline void pm_set_resume_via_firmware(void) {}
@@ -472,6 +475,8 @@ static inline int hibernate_quiet_exec(int (*func)(void *data), void *data) {
}
#endif /* CONFIG_HIBERNATION */
int arch_resume_nosmt(void);
#ifdef CONFIG_HIBERNATION_SNAPSHOT_DEV
int is_hibernate_resume_dev(dev_t dev);
#else
@@ -507,7 +512,6 @@ extern void pm_report_max_hw_sleep(u64 t);
/* drivers/base/power/wakeup.c */
extern bool events_check_enabled;
extern suspend_state_t pm_suspend_target_state;
extern bool pm_wakeup_pending(void);
extern void pm_system_wakeup(void);
@@ -555,6 +559,7 @@ static inline void unlock_system_sleep(unsigned int flags) {}
#ifdef CONFIG_PM_SLEEP_DEBUG
extern bool pm_print_times_enabled;
extern bool pm_debug_messages_on;
extern bool pm_debug_messages_should_print(void);
static inline int pm_dyn_debug_messages_on(void)
{
#ifdef CONFIG_DYNAMIC_DEBUG
@@ -568,14 +573,14 @@ static inline int pm_dyn_debug_messages_on(void)
#endif
#define __pm_pr_dbg(fmt, ...) \
do { \
if (pm_debug_messages_on) \
if (pm_debug_messages_should_print()) \
printk(KERN_DEBUG pr_fmt(fmt), ##__VA_ARGS__); \
else if (pm_dyn_debug_messages_on()) \
pr_debug(fmt, ##__VA_ARGS__); \
} while (0)
#define __pm_deferred_pr_dbg(fmt, ...) \
do { \
if (pm_debug_messages_on) \
if (pm_debug_messages_should_print()) \
printk_deferred(KERN_DEBUG pr_fmt(fmt), ##__VA_ARGS__); \
} while (0)
#else
@@ -593,7 +598,8 @@
/**
* pm_pr_dbg - print pm sleep debug messages
*
* If pm_debug_messages_on is enabled, print message.
* If pm_debug_messages_on is enabled and the system is entering/leaving
* suspend, print message.
* If pm_debug_messages_on is disabled and CONFIG_DYNAMIC_DEBUG is enabled,
* print message only from instances explicitly enabled on dynamic debug's
* control.

kernel/power/main.c

@@ -556,6 +556,12 @@ power_attr_ro(pm_wakeup_irq);
bool pm_debug_messages_on __read_mostly;
bool pm_debug_messages_should_print(void)
{
return pm_debug_messages_on && pm_suspend_target_state != PM_SUSPEND_ON;
}
EXPORT_SYMBOL_GPL(pm_debug_messages_should_print);
static ssize_t pm_debug_messages_show(struct kobject *kobj,
struct kobj_attribute *attr, char *buf)
{
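
A minimal userspace model of the new gate (types simplified): pm_pr_dbg() output is now emitted only while a system-wide transition is in progress, i.e. while pm_suspend_target_state is something other than PM_SUSPEND_ON:

#include <stdbool.h>
#include <stdio.h>

enum { PM_SUSPEND_ON, PM_SUSPEND_TO_IDLE, PM_SUSPEND_MEM };

static bool pm_debug_messages_on = true;
static int pm_suspend_target_state = PM_SUSPEND_ON;

static bool pm_debug_messages_should_print(void)
{
	return pm_debug_messages_on && pm_suspend_target_state != PM_SUSPEND_ON;
}

int main(void)
{
	printf("at run time: %d\n", pm_debug_messages_should_print());    /* 0 */
	pm_suspend_target_state = PM_SUSPEND_MEM; /* suspend begins */
	printf("during suspend: %d\n", pm_debug_messages_should_print()); /* 1 */
	return 0;
}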

kernel/power/snapshot.c

@@ -398,7 +398,7 @@ struct mem_zone_bm_rtree {
unsigned int blocks; /* Number of Bitmap Blocks */
};
/* strcut bm_position is used for browsing memory bitmaps */
/* struct bm_position is used for browsing memory bitmaps */
struct bm_position {
struct mem_zone_bm_rtree *zone;