Commit Graph

93 Commits

Author SHA1 Message Date
Viresh Kumar
44152cb82d cpufreq: governor: Keep single copy of information common to policy->cpus
Some information is common to all CPUs belonging to a policy, but are
kept on per-cpu basis. Lets keep that in another structure common to all
policy->cpus. That will make updates/reads to that less complex and less
error prone.

The memory for cpu_common_dbs_info is allocated/freed at INIT/EXIT, so
that it we don't reallocate it for STOP/START sequence. It will be also
be used (in next patch) while the governor is stopped and so must not be
freed that early.

Reviewed-and-tested-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-07-21 01:12:01 +02:00
Viresh Kumar
42994af63c cpufreq: governor: rename cur_policy as policy
Just call it 'policy', cur_policy is unnecessarily long and doesn't
have any special meaning.

Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-07-17 23:46:48 +02:00
Viresh Kumar
386d46e6d5 cpufreq: governor: Name delayed-work as dwork
Delayed work was named as 'work' and to access work within it we do
work.work. Not much readable. Rename delayed_work as 'dwork'.

Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-07-17 23:46:47 +02:00
Viresh Kumar
732b6d617a cpufreq: governor: Serialize governor callbacks
There are several races reported in cpufreq core around governors (only
ondemand and conservative) by different people.

There are at least two race scenarios present in governor code:
 (a) Concurrent access/updates of governor internal structures.

 It is possible that fields such as 'dbs_data->usage_count', etc.  are
 accessed simultaneously for different policies using same governor
 structure (i.e. CPUFREQ_HAVE_GOVERNOR_PER_POLICY flag unset). And
 because of this we can dereference bad pointers.

 For example consider a system with two CPUs with separate 'struct
 cpufreq_policy' instances. CPU0 governor: ondemand and CPU1: powersave.
 CPU0 switching to powersave and CPU1 to ondemand:
	CPU0				CPU1

	store*				store*

	cpufreq_governor_exit()		cpufreq_governor_init()
					dbs_data = cdata->gdbs_data;

	if (!--dbs_data->usage_count)
		kfree(dbs_data);

					dbs_data->usage_count++;
					*Bad pointer dereference*

 There are other races possible between EXIT and START/STOP/LIMIT as
 well. Its really complicated.

 (b) Switching governor state in bad sequence:

 For example trying to switch a governor to START state, when the
 governor is in EXIT state. There are some checks present in
 __cpufreq_governor() but they aren't sufficient as they compare events
 against 'policy->governor_enabled', where as we need to take governor's
 state into account, which can be used by multiple policies.

These two issues need to be solved separately and the responsibility
should be properly divided between cpufreq and governor core.

The first problem is more about the governor core, as it needs to
protect its structures properly. And the second problem should be fixed
in cpufreq core instead of governor, as its all about sequence of
events.

This patch is trying to solve only the first problem.

There are two types of data we need to protect,
- 'struct common_dbs_data': No matter what, there is going to be a
  single copy of this per governor.
- 'struct dbs_data': With CPUFREQ_HAVE_GOVERNOR_PER_POLICY flag set, we
  will have per-policy copy of this data, otherwise a single copy.

Because of such complexities, the mutex present in 'struct dbs_data' is
insufficient to solve our problem. For example we need to protect
fetching of 'dbs_data' from different structures at the beginning of
cpufreq_governor_dbs(), to make sure it isn't currently being updated.

This can be fixed if we can guarantee serialization of event parsing
code for an individual governor. This is best solved with a mutex per
governor, and the placeholder for that is 'struct common_dbs_data'.

And so this patch moves the mutex from 'struct dbs_data' to 'struct
common_dbs_data' and takes it at the beginning and drops it at the end
of cpufreq_governor_dbs().

Tested with and without following configuration options:

CONFIG_LOCKDEP_SUPPORT=y
CONFIG_DEBUG_RT_MUTEXES=y
CONFIG_DEBUG_PI_LIST=y
CONFIG_DEBUG_SPINLOCK=y
CONFIG_DEBUG_MUTEXES=y
CONFIG_DEBUG_LOCK_ALLOC=y
CONFIG_PROVE_LOCKING=y
CONFIG_LOCKDEP=y
CONFIG_DEBUG_ATOMIC_SLEEP=y

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-06-15 15:42:53 +02:00
Viresh Kumar
8e0484d2b3 cpufreq: governor: register notifier from cs_init()
Notifiers are required only for conservative governor and the common
governor code is unnecessarily polluted with that. Handle that from
cs_init/exit() instead of cpufreq_governor_dbs().

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-06-15 15:37:12 +02:00
Xiaoguang Chen
6d7bcb1464 cpufreq: conservative: set requested_freq to policy max when it is over policy max
When requested_freq is over policy->max, set it to policy->max.
This can help to speed up decreasing frequency.

Signed-off-by: Xiaoguang Chen <chenxg@marvell.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-11-12 23:18:20 +01:00
Xiaoguang Chen
3baa976ae6 cpufreq: conservative: fix requested_freq reduction issue
When decreasing frequency, requested_freq may be less than
freq_target, So requested_freq minus freq_target may be negative,
But reqested_freq's unit is unsigned int, then the negative result
will be one larger interger which may be even higher than
requested_freq.

This patch is to fix such issue. when result becomes negative,
set requested_freq as the min value of policy.

Signed-off-by: Xiaoguang Chen <chenxg@marvell.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-11-07 19:36:19 +01:00
Stratos Karafotis
934dac1ea0 cpufreq: governors: Remove duplicate check of target freq in supported range
Function __cpufreq_driver_target() checks if target_freq is within
policy->min and policy->max range. generic_powersave_bias_target() also
checks if target_freq is valid via a cpufreq_frequency_table_target()
call. So, drop the unnecessary duplicate check in *_check_cpu().

Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-08-28 22:03:02 +02:00
Rafael J. Wysocki
c49a089c3e Merge back earlier 'pm-cpufreq' material 2013-08-14 22:21:16 +02:00
Viresh Kumar
d5b73cd870 cpufreq: Use sizeof(*ptr) convetion for computing sizes
Chapter 14 of Documentation/CodingStyle says:

The preferred form for passing a size of a struct is the following:

	p = kmalloc(sizeof(*p), ...);

The alternative form where struct name is spelled out hurts
readability and introduces an opportunity for a bug when the pointer
variable type is changed but the corresponding sizeof that is passed
to a memory allocator is not.

This wasn't followed consistently in drivers/cpufreq, let's make it
more consistent by always following this rule.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-08-07 23:34:10 +02:00
Viresh Kumar
5ff0a26803 cpufreq: Clean up header files included in the core
This patch addresses the following issues in the header files in the
cpufreq core:
 - Include headers in ascending order, so that we don't add same
   many times by mistake.
 - <asm/> must be included after <linux/>, so that they override
   whatever they need to.
 - Remove unnecessary includes.
 - Don't include files already included by cpufreq.h or
   cpufreq_governor.h.

[rjw: Changelog]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-08-07 23:34:09 +02:00
Viresh Kumar
6c4640c3ad cpufreq: rename ignore_nice as ignore_nice_load
This sysfs file was called ignore_nice_load earlier and commit
4d5dcc4 (cpufreq: governor: Implement per policy instances of
governors) changed its name to ignore_nice by mistake.

Lets get it renamed back to its original name.

Reported-by: Martin von Gagern <Martin.vGagern@gmx.net>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Cc: 3.10+ <stable@vger.kernel.org> # 3.10+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-08-07 22:25:06 +02:00
Stratos Karafotis
9876584772 cpufreq: conservative: Use an inline function to evaluate freq_target
Use an inline function to evaluate freq_target to avoid duplicate code.

Also, define a macro for the default frequency step.

Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-04-01 01:11:36 +02:00
Stratos Karafotis
27ed3cd2eb cpufreq: conservative: Fix the logic in frequency decrease checking
When we evaluate the CPU load for frequency decrease we have to compare
the load against down_threshold. There is no need to subtract 10 points
from down_threshold.

Instead, we have to use the default down_threshold or user's selection
unmodified.

Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-04-01 01:11:35 +02:00
Stratos Karafotis
7af1c0568d cpufreq: conservative: Fix sampling_down_factor functionality
sampling_down_factor tunable is unused since commit
8e677ce83b (4 years ago).

This patch restores the original functionality and documents the
tunable.

Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-04-01 01:11:35 +02:00
Stratos Karafotis
9366d84052 cpufreq: governors: Calculate iowait time only when necessary
Currently we always calculate the CPU iowait time and add it to idle time.
If we are in ondemand and we use io_is_busy, we re-calculate iowait time
and we subtract it from idle time.

With this patch iowait time is calculated only when necessary avoiding
the double call to get_cpu_iowait_time_us. We use a parameter in
function get_cpu_idle_time to distinguish when the iowait time will be
added to idle time or not, without the need of keeping the prev_io_wait.

Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr>
Acked-by: Viresh Kumar <viresh.kumar@linaro.,org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-04-01 01:11:35 +02:00
Namhyung Kim
fed573d5c9 cpufreq: conservative: Fix relation when decreasing frequency
The relation should be CPUFREQ_RELATION_L to find optimal frequency
when decreasing.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-04-01 01:11:35 +02:00
Namhyung Kim
ad529a9cd2 cpufreq: conservative: Break out earlier on the lowest frequency
If we're on the lowest frequency, no need to calculate new freq.
Break out even earlier in this case.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-04-01 01:11:35 +02:00
Viresh Kumar
031299b3be cpufreq: governors: Avoid unnecessary per cpu timer interrupts
Following patch has introduced per cpu timers or works for ondemand and
conservative governors.

	commit 2abfa876f1
	Author: Rickard Andersson <rickard.andersson@stericsson.com>
	Date:   Thu Dec 27 14:55:38 2012 +0000

	    cpufreq: handle SW coordinated CPUs

This causes additional unnecessary interrupts on all cpus when the load is
recently evaluated by any other cpu. i.e. When load is recently evaluated by cpu
x, we don't really need any other cpu to evaluate this load again for the next
sampling_rate time.

Some sort of code is present to avoid that but we are still getting timer
interrupts for all cpus. A good way of avoiding this would be to modify delays
for all cpus (policy->cpus) whenever any cpu has evaluated load.

This patch does this change and some related code cleanup.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-04-01 01:11:35 +02:00
Viresh Kumar
4d5dcc4211 cpufreq: governor: Implement per policy instances of governors
Currently, there can't be multiple instances of single governor_type.
If we have a multi-package system, where we have multiple instances
of struct policy (per package), we can't have multiple instances of
same governor. i.e. We can't have multiple instances of ondemand
governor for multiple packages.

Governors directory in sysfs is created at /sys/devices/system/cpu/cpufreq/
governor-name/. Which again reflects that there can be only one
instance of a governor_type in the system.

This is a bottleneck for multicluster system, where we want different
packages to use same governor type, but with different tunables.

This patch uses the infrastructure provided by earlier patch and
implements init/exit routines for ondemand and conservative
governors.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-04-01 01:11:34 +02:00
Stratos Karafotis
c88883cd54 cpufreq: conservative: Fix typos in comments
Fix a couple of typos in comments.

Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-02-09 12:56:19 +01:00
Viresh Kumar
4447266b84 cpufreq: governors: Remove code redundancy between governors
With the inclusion of following patches:

9f4eb10 cpufreq: conservative: call dbs_check_cpu only when necessary
772b4b1 cpufreq: ondemand: call dbs_check_cpu only when necessary

code redundancy between the conservative and ondemand governors is
introduced again, so get rid of it.

[rjw: Changelog]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Tested-by: Fabio Baltieri <fabio.baltieri@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-02-02 01:02:44 +01:00
Fabio Baltieri
09dca5ae75 cpufreq: governors: fix misuse of cdbs.cpu
Fix governors code to set all cpu's cdbs->cpu to the the actual cpu id
and use cur_policy->cpu istead of cdbs->cpu to track current governor's
leader cpu.

Reported-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Fabio Baltieri <fabio.baltieri@linaro.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-02-02 00:01:16 +01:00
Fabio Baltieri
2624f90c16 cpufreq: governors: implement generic policy_is_shared
Implement a generic helper function policy_is_shared() to replace the
current dbs_sw_coordinated_cpus() at cpufreq level, so that it can be
used by code other than cpufreq governors.

Suggested-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Fabio Baltieri <fabio.baltieri@linaro.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-02-02 00:01:16 +01:00
Fabio Baltieri
66df2a01df cpufreq: conservative: call dbs_check_cpu only when necessary
Modify conservative timer to not resample CPU utilization if recently
sampled from another SW coordinated core.

Signed-off-by: Fabio Baltieri <fabio.baltieri@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-02-02 00:01:13 +01:00
Rickard Andersson
2abfa876f1 cpufreq: handle SW coordinated CPUs
This patch fixes a bug that occurred when we had load on a secondary CPU
and the primary CPU was sleeping. Only one sampling timer was spawned
and it was spawned as a deferred timer on the primary CPU, so when a
secondary CPU had a change in load this was not detected by the cpufreq
governor (both ondemand and conservative).

This patch make sure that deferred timers are run on all CPUs in the
case of software controlled CPUs that run on the same frequency.

Signed-off-by: Rickard Andersson <rickard.andersson@stericsson.com>
Signed-off-by: Fabio Baltieri <fabio.baltieri@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-02-02 00:01:13 +01:00
Viresh Kumar
4471a34f9a cpufreq: governors: remove redundant code
Initially ondemand governor was written and then using its code conservative
governor is written. It used a lot of code from ondemand governor, but copy of
code was created instead of using the same routines from both governors. Which
increased code redundancy, which is difficult to manage.

This patch is an attempt to move common part of both the governors to
cpufreq_governor.c file to come over above mentioned issues.

This shouldn't change anything from functionality point of view.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2012-11-15 00:33:07 +01:00
viresh kumar
2aacdfff9c cpufreq: Move common part from governors to separate file, v2
Multiple cpufreq governers have defined similar get_cpu_idle_time_***()
routines. These routines must be moved to some common place, so that all
governors can use them.

So moving them to cpufreq_governor.c, which seems to be a better place for
keeping these routines.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2012-11-15 00:33:06 +01:00
Linus Torvalds
16642a2e7b Power management updates for 3.7-rc1
* Improved system suspend/resume and runtime PM handling for the SH TMU, CMT
   and MTU2 clock event devices (also used by ARM/shmobile).
 
 * Generic PM domains framework extensions related to cpuidle support and
   domain objects lookup using names.
 
 * ARM/shmobile power management updates including improved support for the
   SH7372's A4S power domain containing the CPU core.
 
 * cpufreq changes related to AMD CPUs support from Matthew Garrett, Andre
   Przywara and Borislav Petkov.
 
 * cpu0 cpufreq driver from Shawn Guo.
 
 * cpufreq governor fixes related to the relaxing of limit from Michal Pecio.
 
 * OMAP cpufreq updates from Axel Lin and Richard Zhao.
 
 * cpuidle ladder governor fixes related to the disabling of states from
   Carsten Emde and me.
 
 * Runtime PM core updates related to the interactions with the system suspend
   core from Alan Stern and Kevin Hilman.
 
 * Wakeup sources modification allowing more helper functions to be called from
   interrupt context from John Stultz and additional diagnostic code from Todd
   Poynor.
 
 * System suspend error code path fix from Feng Hong.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.18 (GNU/Linux)
 
 iQIcBAABAgAGBQJQa1rRAAoJEKhOf7ml8uNsYZ0P/2RZ71sgLWcUCfr0yHaiZeOd
 2GxEYSZ+9BZJHADgoAK/bHRTv8crm40Y2RkbaWbxPDRNuE4SutbvNTGTlJSAguSD
 yHkU/6AFC7u8Jwq+afsWIdGX7eHd78zPpj6EVtVtjHM903WDwbMU2vUz7tQ+fFa+
 ZZ7eydq9j0ec0OoH3UeNhet7JSOpT5BSLgjmIkHMBgIvTxNVDbkB31QUxnUxocxn
 k6S2wQaUSJJWGMLksRRNrhwLq+cGYwTsaOtG/KzRLH1raUyn33B5pcZr0aqhOkjg
 ClaCks3V8o3vRghSwOPB5aVXzjBKvM3UnSyJNIl+FeCeyWuwSNbkEFdA/e7oPuxG
 UsW6dcHiuVo6Ir4+zhd9+lN+/AcPTChO5b7lbU8qRF4ce04czWlUY/KzJjaM+YOE
 CKGq6eX9AHwFjE+h4+VcCXgmzcioiS8Y/CPz13u8N1y0zzwW+ftjb12K+7lVBEG1
 fhrePKHgLw3kJ9LqGpR+4vVur7C+rCf6WwCReTY2vXXVYJ+SuKWTRI4zAjTPXtHa
 i9dpMRASpF+ScRYBcgwIpv789WuHATFKqdBSinZUKBaxQZ5flJ2qIrfqN5VeAejh
 oQs/zZCdIuAtFKqVycQ0L42YxFNKgPFKQErUCSu3M5OuZLlLVLu7yQvIo2Xmo9qf
 Hcrpvo5K+w29YkiwGP9e
 =rbCk
 -----END PGP SIGNATURE-----

Merge tag 'pm-for-3.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management updates from Rafael J Wysocki:

 - Improved system suspend/resume and runtime PM handling for the SH
   TMU, CMT and MTU2 clock event devices (also used by ARM/shmobile).

 - Generic PM domains framework extensions related to cpuidle support
   and domain objects lookup using names.

 - ARM/shmobile power management updates including improved support for
   the SH7372's A4S power domain containing the CPU core.

 - cpufreq changes related to AMD CPUs support from Matthew Garrett,
   Andre Przywara and Borislav Petkov.

 - cpu0 cpufreq driver from Shawn Guo.

 - cpufreq governor fixes related to the relaxing of limit from Michal
   Pecio.

 - OMAP cpufreq updates from Axel Lin and Richard Zhao.

 - cpuidle ladder governor fixes related to the disabling of states from
   Carsten Emde and me.

 - Runtime PM core updates related to the interactions with the system
   suspend core from Alan Stern and Kevin Hilman.

 - Wakeup sources modification allowing more helper functions to be
   called from interrupt context from John Stultz and additional
   diagnostic code from Todd Poynor.

 - System suspend error code path fix from Feng Hong.

Fixed up conflicts in cpufreq/powernow-k8 that stemmed from the
workqueue fixes conflicting fairly badly with the removal of support for
hardware P-state chips.  The changes were independent but somewhat
intertwined.

* tag 'pm-for-3.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (76 commits)
  Revert "PM QoS: Use spinlock in the per-device PM QoS constraints code"
  PM / Runtime: let rpm_resume() succeed if RPM_ACTIVE, even when disabled, v2
  cpuidle: rename function name "__cpuidle_register_driver", v2
  cpufreq: OMAP: Check IS_ERR() instead of NULL for omap_device_get_by_hwmod_name
  cpuidle: remove some empty lines
  PM: Prevent runtime suspend during system resume
  PM QoS: Use spinlock in the per-device PM QoS constraints code
  PM / Sleep: use resume event when call dpm_resume_early
  cpuidle / ACPI : move cpuidle_device field out of the acpi_processor_power structure
  ACPI / processor: remove pointless variable initialization
  ACPI / processor: remove unused function parameter
  cpufreq: OMAP: remove loops_per_jiffy recalculate for smp
  sections: fix section conflicts in drivers/cpufreq
  cpufreq: conservative: update frequency when limits are relaxed
  cpufreq / ondemand: update frequency when limits are relaxed
  properly __init-annotate pm_sysrq_init()
  cpufreq: Add a generic cpufreq-cpu0 driver
  PM / OPP: Initialize OPP table from device tree
  ARM: add cpufreq transiton notifier to adjust loops_per_jiffy for smp
  cpufreq: Remove support for hardware P-state chips from powernow-k8
  ...
2012-10-02 18:32:35 -07:00
Michal Pecio
2d8fced75c cpufreq: conservative: update frequency when limits are relaxed
Reevaluate CPU load and update frequency immediately whenever limits
are changed. Currently conservative doesn't do that when limits are
relaxed, wasting power on systems with relatively low sampling rate.

Signed-off-by: Michal Pecio <mpecio@nvidia.com>
Reviewed-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2012-09-14 21:07:48 +02:00
Amit Daniel Kachhap
2d175069f2 PM / cpufreq: Initialise the cpu field during conservative governor start
This change initialises the cpu id field of cs_cpu_dbs_info structure in
conservative governor and keep this consistent with other governors.
Similar initialisation is present in ondemand governor.

Signed-off-by: Amit Daniel Kachhap <amit.daniel@samsung.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2012-09-04 01:37:47 +02:00
Tejun Heo
203b42f731 workqueue: make deferrable delayed_work initializer names consistent
Initalizers for deferrable delayed_work are confused.

* __DEFERRED_WORK_INITIALIZER()
* DECLARE_DEFERRED_WORK()
* INIT_DELAYED_WORK_DEFERRABLE()

Rename them to

* __DEFERRABLE_WORK_INITIALIZER()
* DECLARE_DEFERRABLE_WORK()
* INIT_DEFERRABLE_WORK()

This patch doesn't cause any functional changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
2012-08-21 13:18:23 -07:00
Martin Schwidefsky
612ef28a04 Merge branch 'sched/core' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into cputime-tip
Conflicts:
	drivers/cpufreq/cpufreq_conservative.c
	drivers/cpufreq/cpufreq_ondemand.c
	drivers/macintosh/rack-meter.c
	fs/proc/stat.c
	fs/proc/uptime.c
	kernel/sched/core.c
2011-12-19 19:23:15 +01:00
Martin Schwidefsky
648616343c [S390] cputime: add sparse checking and cleanup
Make cputime_t and cputime64_t nocast to enable sparse checking to
detect incorrect use of cputime. Drop the cputime macros for simple
scalar operations. The conversion macros are still needed.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-12-15 14:56:19 +01:00
Glauber Costa
3292beb340 sched/accounting: Change cpustat fields to an array
This patch changes fields in cpustat from a structure, to an
u64 array. Math gets easier, and the code is more flexible.

Signed-off-by: Glauber Costa <glommer@parallels.com>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Paul Tuner <pjt@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1322498719-2255-2-git-send-email-glommer@parallels.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-12-06 09:06:38 +01:00
Michal Hocko
6beea0cda8 nohz: Fix update_ts_time_stat idle accounting
update_ts_time_stat currently updates idle time even if we are in
iowait loop at the moment. The only real users of the idle counter
(via get_cpu_idle_time_us) are CPU governors and they expect to get
cumulative time for both idle and iowait times.
The value (idle_sleeptime) is also printed to userspace by print_cpu
but it prints both idle and iowait times so the idle part is misleading.

Let's clean this up and fix update_ts_time_stat to account both counters
properly and update consumers of idle to consider iowait time as well.
If we do this we might use get_cpu_{idle,iowait}_time_us from other
contexts as well and we will get expected values.

Signed-off-by: Michal Hocko <mhocko@suse.cz>
Cc: Dave Jones <davej@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Link: http://lkml.kernel.org/r/e9c909c221a8da402c4da07e4cd968c3218f8eb1.1314172057.git.mhocko@suse.cz
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-09-08 11:10:55 +02:00
Thomas Renninger
326c86deae [CPUFREQ] Remove unneeded locks
There cannot be any concurrent access to these through
different cpu sysfs files anymore, because these tunables
are now all global (not per cpu).

I still have some doubts whether some of these locks
were needed at all. Anyway, let's get rid of them.

Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Dave Jones <davej@redhat.com>
CC: cpufreq@vger.kernel.org
2011-03-16 17:54:32 -04:00
Thomas Renninger
e8951251b8 [CPUFREQ] Remove old, deprecated per cpu ondemand/conservative sysfs files
Marked deprecated for quite a whilte now...

Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Dave Jones <davej@redhat.com>
CC: cpufreq@vger.kernel.org
2011-03-16 17:54:32 -04:00
Thomas Renninger
ef598549b2 [CPUFREQ] Remove deprecated sysfs file sampling_rate_max
Marked deprecated for quite a while now...

Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Dave Jones <davej@redhat.com>
CC: cpufreq@vger.kernel.org
2011-03-16 17:54:32 -04:00
Joe Perches
2feb690c20 [CPUFREQ] drivers/cpufreq: Remove unnecessary semicolons
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Dave Jones <davej@redhat.com>
2011-03-16 17:54:31 -04:00
Tejun Heo
57df5573a5 cpufreq: use system_wq instead of dedicated workqueues
With cmwq, there's no reason for cpufreq drivers to use separate
workqueues.  Remove the dedicated workqueues from cpufreq_conservative
and cpufreq_ondemand and use system_wq instead.  The work items are
already sync canceled on stop, so it's already guaranteed that no work
is running on module exit.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Dave Jones <davej@redhat.com>
Cc: cpufreq@vger.kernel.org
2011-01-26 12:12:50 +01:00
H. Peter Anvin
d7be0ce6af Merge commit 'v2.6.34-rc6' into x86/cpu 2010-05-08 14:59:58 -07:00
Borislav Petkov
6dad2a2964 cpufreq: Unify sysfs attribute definition macros
Multiple modules used to define those which are with identical
functionality and were needlessly replicated among the different cpufreq
drivers. Push them into the header and remove duplication.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
LKML-Reference: <1270065406-1814-7-git-send-email-bp@amd64.org>
Reviewed-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-04-09 14:07:56 -07:00
Dominik Brodowski
fd187aaf98 [CPUFREQ] use max load in conservative governor
Instead of using the load of the last CPU in a package, use the
maximum load of all CPUs in a package.

Reported-by: Jean-Christian Goussard <jeanchristian.goussard@sfr.fr>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Dave Jones <davej@redhat.com>
2010-03-31 12:00:22 -04:00
Thomas Renninger
49b015ce38 [CPUFREQ] Use global sysfs cpufreq structure for conservative governor tunings
Same adustments that have been added to the ondemand recently.

Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Dave Jones <davej@redhat.com>
2009-11-24 13:33:34 -05:00
Pallipadi, Venkatesh
54c9a35d9f [CPUFREQ] Resolve time unit thinko in ondemand/conservative govs
ondemand and conservative governors are messing up time units in the
code path where NO_HZ is not enabled and ignore_nice is set. The walltime
idletime stored is in jiffies and nice time calculation is happening in
microseconds.

The problem was reported and diagnosed by Alexander here.
http://marc.info/?l=linux-kernel&m=125752550404513&w=2

The patch below fixes this thinko.

Reported-by: Alexander Miller <Miller@fmi.uni-stuttgart.de>
Tested-by: Alexander Miller <Miller@fmi.uni-stuttgart.de>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Dave Jones <davej@redhat.com>
2009-11-17 23:15:04 -05:00
Tejun Heo
384be2b18a Merge branch 'percpu-for-linus' into percpu-for-next
Conflicts:
	arch/sparc/kernel/smp_64.c
	arch/x86/kernel/cpu/perf_counter.c
	arch/x86/kernel/setup_percpu.c
	drivers/cpufreq/cpufreq_ondemand.c
	mm/percpu.c

Conflicts in core and arch percpu codes are mostly from commit
ed78e1e078dd44249f88b1dd8c76dafb39567161 which substituted many
num_possible_cpus() with nr_cpu_ids.  As for-next branch has moved all
the first chunk allocators into mm/percpu.c, the changes are moved
from arch code to mm/percpu.c.

Signed-off-by: Tejun Heo <tj@kernel.org>
2009-08-14 14:45:31 +09:00
Pallipadi, Venkatesh
26d204afa1 [CPUFREQ] Fix NULL pointer dereference regression in conservative governor
Commit ee88415caf
introduced this regression when it removed enable bit in cpu_dbs_info_s.
That added a possibility of dbs_cpufreq_notifier getting called for a
CPU that is not yet managed by conservative governor. That will happen
as the transition notifier is set as soon as one CPU switches to
conservative governor and other CPUs can get a NULL pointer dereference
without the enable bit check. Add the enable bit back again.

Reported-by: Lermytte Christophe <Christophe.Lermytte@thomson.net>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Dave Jones <davej@redhat.com>
2009-08-04 14:32:10 -04:00
venkatesh.pallipadi@intel.com
ee88415caf [CPUFREQ] Cleanup locking in conservative governor
Redesign the locking inside conservative driver. Make dbs_mutex handle all the
global state changes inside the driver and invent a new percpu mutex
to serialize percpu timer and frequency limit change.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Dave Jones <davej@redhat.com>
2009-07-06 21:38:28 -04:00
venkatesh.pallipadi@intel.com
7d26e2d5e2 [CPUFREQ] Eliminate the recent lockdep warnings in cpufreq
Commit b14893a62c although it was very
much needed to properly cleanup ondemand timer, opened-up a can of worms
related to locking dependencies in cpufreq.

Patch here defines the need for dbs_mutex and cleans up its usage in
ondemand governor. This also resolves the lockdep warnings reported here

http://lkml.indiana.edu/hypermail/linux/kernel/0906.1/01925.html
http://lkml.indiana.edu/hypermail/linux/kernel/0907.0/00820.html

and few others..

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Dave Jones <davej@redhat.com>
2009-07-06 21:38:27 -04:00