Doug / smarc-fsl-linux-kernel | Embedian Git Server

12 Sep, 2013

5 commits

f1728fd15 Merge branch 'pm-cpufreq'

* pm-cpufreq:
cpufreq: Acquire the lock in cpufreq_policy_restore() for reading
cpufreq: Prevent problems in update_policy_cpu() if last_cpu == new_cpu
cpufreq: Restructure if/else block to avoid unintended behavior
cpufreq: Fix crash in cpufreq-stats during suspend/resume

Rafael J. Wysocki
2013-09-12 19:04:11 +0800
44871c9c7 cpufreq: Acquire the lock in cpufreq_policy_restore() for reading

In cpufreq_policy_restore(), the policy saved before system suspend is read
from the per-CPU cpufreq_cpu_data_fallback. That is a read operation rather
than a write, so take the lock for reading there.

Signed-off-by: Lan Tianyu
Reviewed-by: Srivatsa S. Bhat
Acked-by: Viresh Kumar
Signed-off-by: Rafael J. Wysocki

Lan Tianyu
2013-09-12 05:30:03 +0800
cb38ed5cf cpufreq: Prevent problems in update_policy_cpu() if last_cpu == new_cpu

If update_policy_cpu() is invoked with the existing policy->cpu itself
as the new-cpu parameter, then a lot of things can go terribly wrong.

In its present form, update_policy_cpu() always assumes that the new-cpu
is different from policy->cpu and invokes other functions to perform their
respective updates. And those functions implement the actual update like
this:

per_cpu(..., new_cpu) = per_cpu(..., last_cpu);
per_cpu(..., last_cpu) = NULL;

Thus, when new_cpu == last_cpu, the final NULL assignment makes the per-cpu
references vanish into thin air! (memory leak). From there, it leads to more
problems: cpufreq_stats_create_table() now doesn't find the per-cpu reference
and hence tries to create a new sysfs-group; but sysfs had already created
the group earlier, so it complains that it cannot create a duplicate filename.
In short, the repercussions of a rather innocuous invocation of
update_policy_cpu() can turn out to be pretty nasty.

Ideally update_policy_cpu() should handle this situation (new == last)
gracefully, and not lead to such severe problems. So fix it by adding an
appropriate check.
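
The failure mode and the guard can be sketched in user-space C. This is only an illustration under stated assumptions, not the kernel patch: sketch_stats_table stands in for a per-CPU pointer, and every sketch_* name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_NR_CPUS 4

/* Hypothetical stand-in for the object a per-CPU pointer refers to. */
struct sketch_stats { int placeholder; };

/* Hypothetical stand-in for a per-CPU pointer table. */
struct sketch_stats *sketch_stats_table[SKETCH_NR_CPUS];

/* Unguarded move: the two-assignment pattern quoted in the message. */
void sketch_move_unguarded(unsigned int last_cpu, unsigned int new_cpu)
{
	assert(last_cpu < SKETCH_NR_CPUS && new_cpu < SKETCH_NR_CPUS);
	sketch_stats_table[new_cpu] = sketch_stats_table[last_cpu];
	/* If new_cpu == last_cpu, this wipes the only reference. */
	sketch_stats_table[last_cpu] = NULL;
}

/* Guarded move: nothing to do when the CPU does not change. */
void sketch_move_guarded(unsigned int last_cpu, unsigned int new_cpu)
{
	assert(last_cpu < SKETCH_NR_CPUS && new_cpu < SKETCH_NR_CPUS);
	if (new_cpu == last_cpu)
		return;
	sketch_move_unguarded(last_cpu, new_cpu);
}
```

With the guard in place, a call with identical CPUs leaves the table untouched, while a call with distinct CPUs still moves the reference.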

Signed-off-by: Srivatsa S. Bhat
Tested-by: Stephen Warren
Signed-off-by: Rafael J. Wysocki

Srivatsa S. Bhat
2013-09-12 05:29:57 +0800
61173f256 cpufreq: Restructure if/else block to avoid unintended behavior

In __cpufreq_remove_dev_prepare(), the code which decides whether to remove
the sysfs link or nominate a new policy cpu, is governed by an if/else block
with a rather complex set of conditionals. Worse, they harbor a subtlety
which leads to certain unintended behavior.

The code looks like this:

	if (cpu != policy->cpu && !frozen) {
		sysfs_remove_link(&dev->kobj, "cpufreq");
	} else if (cpus > 1) {
		new_cpu = cpufreq_nominate_new_policy_cpu(...);
		...
		update_policy_cpu(..., new_cpu);
	}

The original intention was:
If the CPU going offline is not policy->cpu, just remove the link.
On the other hand, if the CPU going offline is the policy->cpu itself,
hand over the policy->cpu job to some other surviving CPU in that policy.

But because the 'if' condition also includes the 'frozen' check, now there
are *two* possibilities by which we can enter the 'else' block:

1. cpu == policy->cpu (intended)
2. cpu != policy->cpu && frozen (unintended)

Due to the second (unintended) scenario, we end up spuriously nominating
a CPU as the policy->cpu, even when the existing policy->cpu is alive and
well. This can cause problems further down the line, especially when we end
up nominating the same policy->cpu as the new one (i.e., old == new),
because it totally confuses update_policy_cpu().

To avoid this mess, restructure the if/else block to only do what was
originally intended, and thus prevent any unwelcome surprises.
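
The difference between the two control flows can be sketched as a pure decision function. This is one possible restructuring under stated assumptions, not necessarily the exact patch: the enum and both sketch_* functions are hypothetical, and the real code acts on sysfs and the policy rather than returning a value.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical result type: what the offline path would do. */
enum sketch_action {
	SKETCH_NONE,
	SKETCH_REMOVE_LINK,
	SKETCH_NOMINATE_NEW_CPU
};

/* Mirrors the original if/else: 'frozen' leaks into the else branch. */
enum sketch_action sketch_decide_old(unsigned int cpu, unsigned int policy_cpu,
				     unsigned int cpus, bool frozen)
{
	assert(cpus >= 1);
	if (cpu != policy_cpu && !frozen)
		return SKETCH_REMOVE_LINK;
	else if (cpus > 1)
		return SKETCH_NOMINATE_NEW_CPU;
	return SKETCH_NONE;
}

/* Restructured: only the outgoing policy->cpu can trigger a nomination. */
enum sketch_action sketch_decide_new(unsigned int cpu, unsigned int policy_cpu,
				     unsigned int cpus, bool frozen)
{
	assert(cpus >= 1);
	if (cpu != policy_cpu) {
		if (!frozen)
			return SKETCH_REMOVE_LINK;
		return SKETCH_NONE;
	}
	if (cpus > 1)
		return SKETCH_NOMINATE_NEW_CPU;
	return SKETCH_NONE;
}
```

For cpu != policy_cpu with frozen set, the old flow nominates a new CPU while the restructured flow does nothing, which is the unintended case described above.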

Signed-off-by: Srivatsa S. Bhat
Tested-by: Stephen Warren
Signed-off-by: Rafael J. Wysocki

Srivatsa S. Bhat
2013-09-12 05:29:57 +0800
0d66b91eb cpufreq: Fix crash in cpufreq-stats during suspend/resume

Stephen Warren reported that the cpufreq-stats code hits a NULL pointer
dereference during the second attempt to suspend a system. He also
pinpointed the problem to commit 5302c3f "cpufreq: Perform light-weight
init/teardown during suspend/resume".

That commit actually ensured that the cpufreq-stats table and the
cpufreq-stats sysfs entries are *not* torn down (i.e., not freed) during
suspend/resume, which makes it all the more surprising. However, it turns
out that the root cause is not that we access already-freed memory, but
that the reference to the allocated memory gets moved around and we lose
track of that during resume, leading to the reported crash in a subsequent
suspend attempt.

In the suspend path, during CPU offline, the value of policy->cpu is updated
by choosing one of the surviving CPUs in that policy, as long as there is
at least one CPU in that policy. And cpufreq_stats_update_policy_cpu() is
invoked to update the reference to the stats structure by assigning it to
the new CPU. However, in the resume path, during CPU online, we end up
assigning a fresh CPU as the policy->cpu, without letting cpufreq-stats
know about this. Thus the reference to the stats structure remains
(incorrectly) associated with the old CPU. So, in a subsequent suspend attempt,
during CPU offline, we end up accessing an incorrect location to get the
stats structure, which eventually leads to the NULL pointer dereference.

Fix this by letting cpufreq-stats know about the update of the policy->cpu
during CPU online in the resume path. (Also, move the update_policy_cpu()
function higher up in the file, so that __cpufreq_add_dev() can invoke
it).

Reported-and-tested-by: Stephen Warren
Signed-off-by: Srivatsa S. Bhat
Signed-off-by: Rafael J. Wysocki

Srivatsa S. Bhat
2013-09-12 05:29:57 +0800

11 Sep, 2013

2 commits

0df03a30c Merge branch 'pm-cpufreq'

* pm-cpufreq:
intel_pstate: Add Haswell CPU models
Revert "cpufreq: make sure frequency transitions are serialized"
cpufreq: Use signed type for 'ret' variable, to store negative error values
cpufreq: Remove temporary fix for race between CPU hotplug and sysfs-writes
cpufreq: Synchronize the cpufreq store_*() routines with CPU hotplug
cpufreq: Invoke __cpufreq_remove_dev_finish() after releasing cpu_hotplug.lock
cpufreq: Split __cpufreq_remove_dev() into two parts
cpufreq: Fix wrong time unit conversion
cpufreq: serialize calls to __cpufreq_governor()
cpufreq: don't allow governor limits to be changed when it is disabled

Rafael J. Wysocki
2013-09-11 21:23:15 +0800
6cdcdb793 intel_pstate: Add Haswell CPU models

Enable the intel_pstate driver for Haswell CPUs. One missing Ivy Bridge
model (0x3E) is also included. Models referenced from
tools/power/x86/turbostat/turbostat.c:has_nehalem_turbo_ratio_limit

Signed-off-by: Nell Hardcastle
Acked-by: Viresh Kumar
Acked-by: Dirk Brandewie
Signed-off-by: Rafael J. Wysocki

Nell Hardcastle
2013-09-11 05:10:39 +0800

10 Sep, 2013

9 commits

798282a87 Revert "cpufreq: make sure frequency transitions are serialized"

Commit 7c30ed5 (cpufreq: make sure frequency transitions are
serialized) attempted to serialize frequency transitions by
adding checks to the CPUFREQ_PRECHANGE and CPUFREQ_POSTCHANGE
notifications. However, it assumed that the notifications will
always originate from the driver's .target() callback, but they
also can be triggered by cpufreq_out_of_sync() and that leads to
warnings like this on some systems:

WARNING: CPU: 0 PID: 14543 at drivers/cpufreq/cpufreq.c:317
__cpufreq_notify_transition+0x238/0x260()
In middle of another frequency transition

accompanied by a call trace similar to this one:

dump_stack+0x46/0x58
warn_slowpath_common+0x8c/0xc0
? acpi_cpufreq_target+0x320/0x320
warn_slowpath_fmt+0x46/0x50
__cpufreq_notify_transition+0x238/0x260
cpufreq_notify_transition+0x3e/0x70
cpufreq_out_of_sync+0x6d/0xb0
cpufreq_update_policy+0x10c/0x160
? cpufreq_update_policy+0x160/0x160
cpufreq_set_cur_state+0x8c/0xb5
processor_set_cur_state+0xa3/0xcf
thermal_cdev_update+0x9c/0xb0
step_wise_throttle+0x5a/0x90
handle_thermal_trip+0x4f/0x140
thermal_zone_device_update+0x57/0xa0
acpi_thermal_check+0x2e/0x30
acpi_thermal_notify+0x40/0xdc
acpi_device_notify+0x19/0x1b
acpi_ev_notify_dispatch+0x41/0x5c
acpi_os_execute_deferred+0x25/0x32
process_one_work+0x170/0x4a0
worker_thread+0x121/0x390
? manage_workers.isra.20+0x170/0x170
kthread+0xc0/0xd0
? flush_kthread_worker+0xb0/0xb0
ret_from_fork+0x7c/0xb0
? flush_kthread_worker+0xb0/0xb0

For this reason, revert commit 7c30ed5 along with the fix 266c13d
(cpufreq: Fix serialization of frequency transitions) on top of it
and we will revisit the serialization problem later.

Reported-by: Alessandro Bono
Signed-off-by: Rafael J. Wysocki

Rafael J. Wysocki
2013-09-10 08:54:50 +0800
5136fa565 cpufreq: Use signed type for 'ret' variable, to store negative error values

There are places where the variable 'ret' is declared as unsigned int
and then used to store negative return values such as -EINVAL. Fix them
by declaring the variable as a signed quantity.
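
Why the unsigned declaration matters can be sketched as follows. This is only an illustration under stated assumptions: sketch_do_step() is a hypothetical helper, and the widening cast is there just to show what value a "negative means error" test would see.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* The widening cast below assumes long long is wider than unsigned int. */
static_assert(sizeof(long long) > sizeof(unsigned int),
	      "sketch assumes long long is wider than unsigned int");

/* Hypothetical helper that reports failure as a negative errno value. */
int sketch_do_step(bool ok)
{
	return ok ? 0 : -EINVAL;
}

/* 'ret' is unsigned: -EINVAL wraps to a large positive value. */
bool sketch_error_seen_unsigned(bool ok)
{
	unsigned int ret = (unsigned int)sketch_do_step(ok);

	/* Widened without sign extension, so this is never negative. */
	return (long long)ret < 0;
}

/* 'ret' is signed: the negative error value survives. */
bool sketch_error_seen_signed(bool ok)
{
	int ret = sketch_do_step(ok);

	return ret < 0;
}
```

The unsigned variant never reports the failure, whereas the signed variant does.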

Signed-off-by: Srivatsa S. Bhat
Signed-off-by: Rafael J. Wysocki

Srivatsa S. Bhat
2013-09-10 08:49:48 +0800
56d07db27 cpufreq: Remove temporary fix for race between CPU hotplug and sysfs-writes

Commit "cpufreq: serialize calls to __cpufreq_governor()" had been a temporary
and partial solution to the race condition between writing to a cpufreq sysfs
file and taking a CPU offline. Now that we have a proper and complete solution
to that problem, remove the temporary fix.

Signed-off-by: Srivatsa S. Bhat
Signed-off-by: Rafael J. Wysocki

Srivatsa S. Bhat
2013-09-10 08:49:47 +0800
4f750c930 cpufreq: Synchronize the cpufreq store_*() routines with CPU hotplug

The functions that are used to write to cpufreq sysfs files (such as
store_scaling_max_freq()) are not hotplug safe. They can race with CPU
hotplug tasks and lead to problems such as trying to acquire an already
destroyed timer-mutex etc.

Eg:

__cpufreq_remove_dev()
 __cpufreq_governor(policy, CPUFREQ_GOV_STOP);
  policy->governor->governor(policy, CPUFREQ_GOV_STOP);
   cpufreq_governor_dbs()
    case CPUFREQ_GOV_STOP:
     mutex_destroy(&cpu_cdbs->timer_mutex)
     cpu_cdbs->cur_policy = NULL;

store()
 __cpufreq_set_policy()
  __cpufreq_governor(policy, CPUFREQ_GOV_LIMITS);
   policy->governor->governor(policy, CPUFREQ_GOV_LIMITS);
    case CPUFREQ_GOV_LIMITS:
     mutex_lock(&cpu_cdbs->timer_mutex);
     [...] max < cpu_cdbs->cur_policy->cur)
[...]

Signed-off-by: Srivatsa S. Bhat
Signed-off-by: Rafael J. Wysocki

Srivatsa S. Bhat
2013-09-10 08:49:47 +0800
1aee40ac9 cpufreq: Invoke __cpufreq_remove_dev_finish() after releasing cpu_hotplug.lock

__cpufreq_remove_dev_finish() handles the kobject cleanup for a CPU going
offline. But because we destroy the kobject towards the end of the CPU offline
phase, there are certain race windows where a task can try to write to a
cpufreq sysfs file (eg: using store_scaling_max_freq()) while we are taking
that CPU offline, and this can bump up the kobject refcount, which in turn might
hinder the CPU offline task from running to completion. (It can also cause
other more serious problems such as trying to acquire a destroyed timer-mutex
etc., depending on the exact stage of the cleanup at which the task managed to
take a new refcount).

To fix the race window, we will need to synchronize those store_*() call-sites
with CPU hotplug, using get_online_cpus()/put_online_cpus(). However, that
in turn can cause a total deadlock because it can end up waiting for the
CPU offline task to complete, with incremented refcount!

Write to sysfs                  CPU offline task
--------------                  ----------------
kobj_refcnt++

                                Acquire cpu_hotplug.lock

get_online_cpus();

                                Wait for kobj_refcnt to drop to zero

                **DEADLOCK**

A simple way to avoid this problem is to perform the kobject cleanup in the
CPU offline path, with the cpu_hotplug.lock *released*. That is, we can
perform the wait-for-kobj-refcnt-to-drop as well as the subsequent cleanup
in the CPU_POST_DEAD stage of CPU offline, which is run with cpu_hotplug.lock
released. Doing this helps us avoid deadlocks due to holding kobject refcounts
and waiting on each other on the cpu_hotplug.lock.

(Note: We can't move all of the cpufreq CPU offline steps to the
CPU_POST_DEAD stage, because certain things such as stopping the governors
have to be done before the outgoing CPU is marked offline. So retain those
parts in the CPU_DOWN_PREPARE stage itself).

Reported-by: Stephen Boyd
Signed-off-by: Srivatsa S. Bhat
Signed-off-by: Rafael J. Wysocki

Srivatsa S. Bhat
2013-09-10 08:49:47 +0800
cedb70afd cpufreq: Split __cpufreq_remove_dev() into two parts

During CPU offline, the cpufreq core invokes __cpufreq_remove_dev()
to perform work such as stopping the cpufreq governor, clearing the
CPU from the policy structure etc, and finally cleaning up the
kobject.

There are certain subtle issues related to the kobject cleanup, and
it would be much easier to deal with them if we separate that part
from the rest of the cleanup-work in the CPU offline phase. So split
the __cpufreq_remove_dev() function into 2 parts: one that handles
the kobject cleanup, and the other that handles the rest of the work.

Reported-by: Stephen Boyd
Signed-off-by: Srivatsa S. Bhat
Signed-off-by: Rafael J. Wysocki

Srivatsa S. Bhat
2013-09-10 08:49:46 +0800
a857c0b9e cpufreq: Fix wrong time unit conversion

The time spent by a CPU under a given frequency is stored in jiffies unit
in the cpu var cpufreq_stats_table->time_in_state[i], i being the index of
the frequency.

This is what is displayed in the right column of the following file:

cat /sys/devices/system/cpu/cpuX/cpufreq/stats/time_in_state
2301000 19835820
2300000 3172
[...]

Now cpufreq converts this jiffies unit delta to clock_t before returning it
to the user as in the above file. And that conversion is achieved using the API
cputime64_to_clock_t().

Although it accidentally works on traditional tick-based cputime accounting, where
cputime_t maps directly to jiffies, it doesn't work with other types of cputime
accounting such as CONFIG_VIRT_CPU_ACCOUNTING_* where cputime_t can map to nsecs
or any granularity preferred by the architecture.

For example we get a buggy zero delta on full dyntick configurations:

cat /sys/devices/system/cpu/cpuX/cpufreq/stats/time_in_state
2301000 0
2300000 0
[...]

Fix this by using the proper jiffies_64 to clock_t conversion
(jiffies_64_to_clock_t()).
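
The unit mismatch can be sketched in user-space C. This is a simplified illustration under stated assumptions, not the kernel's conversion code: the tick rates are assumed values, all sketch_* names are hypothetical, and the nanosecond variant models only one possible cputime granularity.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_HZ            1000u       /* assumed kernel tick rate */
#define SKETCH_USER_HZ        100u       /* assumed clock_t rate for user space */
#define SKETCH_NSEC_PER_SEC  1000000000u

static_assert(SKETCH_HZ % SKETCH_USER_HZ == 0,
	      "sketch assumes HZ is a multiple of USER_HZ");

/* Correct for this data: the input really is a count of jiffies. */
uint64_t sketch_jiffies_to_clock_t(uint64_t jiffies)
{
	return jiffies / (SKETCH_HZ / SKETCH_USER_HZ);
}

/* Wrong for this data: treats the same number as nanoseconds of cputime. */
uint64_t sketch_ns_to_clock_t(uint64_t ns)
{
	return ns / (SKETCH_NSEC_PER_SEC / SKETCH_USER_HZ);
}
```

Under these assumptions, a jiffies count of 2500 converts to 250 clock ticks, but the nanosecond-based conversion of the same number rounds down to 0, which matches the zero deltas shown above.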

Reported-and-tested-by: Carsten Emde
Signed-off-by: Andreas Schwab
Signed-off-by: Frederic Weisbecker
Acked-by: Paul E. McKenney
Signed-off-by: Rafael J. Wysocki

Andreas Schwab
2013-09-10 08:49:46 +0800
19c763031 cpufreq: serialize calls to __cpufreq_governor()

We can't take a big lock around __cpufreq_governor() as this causes
recursive locking for some cases. But calls to this routine must be
serialized for every policy. Otherwise we can see some unpredictable
events.

For example, consider the following scenario:

__cpufreq_remove_dev()
 __cpufreq_governor(policy, CPUFREQ_GOV_STOP);
  policy->governor->governor(policy, CPUFREQ_GOV_STOP);
   cpufreq_governor_dbs()
    case CPUFREQ_GOV_STOP:
     mutex_destroy(&cpu_cdbs->timer_mutex)
     cpu_cdbs->cur_policy = NULL;

store()
 __cpufreq_set_policy()
  __cpufreq_governor(policy, CPUFREQ_GOV_LIMITS);
   policy->governor->governor(policy, CPUFREQ_GOV_LIMITS);
    case CPUFREQ_GOV_LIMITS:
     mutex_lock(&cpu_cdbs->timer_mutex);
     [...] max < cpu_cdbs->cur_policy->cur)
[...]

Signed-off-by: Viresh Kumar
Signed-off-by: Rafael J. Wysocki

Viresh Kumar
2013-09-10 08:49:46 +0800
f73d39338 cpufreq: don't allow governor limits to be changed when it is disabled

__cpufreq_governor() returns -EBUSY when the governor is already
stopped and we try to stop it again, but when it is stopped we must
not allow CPUFREQ_GOV_LIMITS events either.

This patch adds this check in __cpufreq_governor().
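
The kind of check described here can be sketched as a small state machine. This is a user-space illustration under stated assumptions, not the kernel code: struct sketch_gov, the event enum and sketch_gov_dispatch() are hypothetical, and only the stop and limits cases named in the message are modelled.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical governor events, loosely modelled on CPUFREQ_GOV_*. */
enum sketch_gov_event { SKETCH_GOV_START, SKETCH_GOV_STOP, SKETCH_GOV_LIMITS };

/* Hypothetical governor state: only tracks whether it is running. */
struct sketch_gov { bool enabled; };

int sketch_gov_dispatch(struct sketch_gov *gov, enum sketch_gov_event event)
{
	assert(gov != NULL);

	/* A stopped governor accepts neither a second stop nor a limits update. */
	if (!gov->enabled &&
	    (event == SKETCH_GOV_STOP || event == SKETCH_GOV_LIMITS))
		return -EBUSY;

	if (event == SKETCH_GOV_START)
		gov->enabled = true;
	else if (event == SKETCH_GOV_STOP)
		gov->enabled = false;

	return 0;
}
```

A limits event is accepted only between a start and the following stop; before the start or after the stop it is rejected with -EBUSY.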

Reported-by: Stephen Boyd
Signed-off-by: Viresh Kumar
Signed-off-by: Rafael J. Wysocki

Viresh Kumar
2013-09-10 08:49:45 +0800

04 Sep, 2013

1 commit

40031da44 Merge tag 'pm+acpi-3.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI and power management updates from Rafael Wysocki:

1) ACPI-based PCI hotplug (ACPIPHP) subsystem rework and introduction
of Intel Thunderbolt support on systems that use ACPI for signalling
Thunderbolt hotplug events. This also should make ACPIPHP work in
some cases in which it was known to have problems. From
Rafael J Wysocki, Mika Westerberg and Kirill A Shutemov.

2) ACPI core code cleanups and dock station support cleanups from
Jiang Liu and Rafael J Wysocki.

3) Fixes for locking problems related to ACPI device hotplug from
Rafael J Wysocki.

4) ACPICA update to version 20130725 including fixes, cleanups, support
for more than 256 GPEs per GPE block and a change to make the ACPI
PM Timer optional (we've seen systems without the PM Timer in the
field already). One of the fixes, related to the DeRefOf operator,
is necessary to prevent some Windows 8 oriented AML from causing
problems. From Bob Moore, Lv Zheng, and Jung-uk Kim.

5) Removal of the old and long deprecated /proc/acpi/event interface
and related driver changes from Thomas Renninger.

6) ACPI and Xen changes to make the reduced hardware sleep work with
the latter from Ben Guthro.

7) ACPI video driver cleanups and a blacklist of systems that should
not tell the BIOS that they are compatible with Windows 8 (or ACPI
backlight and possibly other things will not work on them). From
Felipe Contreras.

8) Assorted ACPI fixes and cleanups from Aaron Lu, Hanjun Guo,
Kuppuswamy Sathyanarayanan, Lan Tianyu, Sachin Kamat, Tang Chen,
Toshi Kani, and Wei Yongjun.

9) cpufreq ondemand governor target frequency selection change to
reduce oscillations between min and max frequencies (essentially,
it causes the governor to choose target frequencies proportional
to load) from Stratos Karafotis.

10) cpufreq fixes allowing sysfs attributes file permissions to be
preserved over suspend/resume cycles from Srivatsa S Bhat.

11) Removal of Device Tree parsing for CPU device nodes from multiple
cpufreq drivers that required some changes related to
of_get_cpu_node() to be made in a few architectures and in the
driver core. From Sudeep KarkadaNagesha.

12) cpufreq core fixes and cleanups related to mutual exclusion and
driver module references from Viresh Kumar, Lukasz Majewski and
Rafael J Wysocki.

13) Assorted cpufreq fixes and cleanups from Amit Daniel Kachhap,
Bartlomiej Zolnierkiewicz, Hanjun Guo, Jingoo Han, Joseph Lo,
Julia Lawall, Li Zhong, Mark Brown, Sascha Hauer, Stephen Boyd,
Stratos Karafotis, and Viresh Kumar.

14) Fixes for race conditions in coupled cpuidle from Colin Cross.

15) cpuidle core fixes and cleanups from Daniel Lezcano and
Tuukka Tikkanen.

16) Assorted cpuidle fixes and cleanups from Daniel Lezcano,
Geert Uytterhoeven, Jingoo Han, Julia Lawall, Linus Walleij,
and Sahara.

17) System sleep tracing changes from Todd E Brandt and Shuah Khan.

18) PNP subsystem conversion to using struct dev_pm_ops for power
management from Shuah Khan.

* tag 'pm+acpi-3.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (217 commits)
cpufreq: Don't use smp_processor_id() in preemptible context
cpuidle: coupled: fix race condition between pokes and safe state
cpuidle: coupled: abort idle if pokes are pending
cpuidle: coupled: disable interrupts after entering safe state
ACPI / hotplug: Remove containers synchronously
driver core / ACPI: Avoid device hot remove locking issues
cpufreq: governor: Fix typos in comments
cpufreq: governors: Remove duplicate check of target freq in supported range
cpufreq: Fix timer/workqueue corruption due to double queueing
ACPI / EC: Add ASUSTEK L4R to quirk list in order to validate ECDT
ACPI / thermal: Add check of "_TZD" availability and evaluating result
cpufreq: imx6q: Fix clock enable balance
ACPI: blacklist win8 OSI for buggy laptops
cpufreq: tegra: fix the wrong clock name
cpuidle: Change struct menu_device field types
cpuidle: Add a comment warning about possible overflow
cpuidle: Fix variable domains in get_typical_interval()
cpuidle: Fix menu_device->intervals type
cpuidle: CodingStyle: Break up multiple assignments on single line
cpuidle: Check called function parameter in get_typical_interval()
...

Linus Torvalds
2013-09-04 06:59:39 +0800

01 Sep, 2013

1 commit

f27a5fb42 Merge remote-tracking branch 'regulator/topic/optional' into regulator-next

Mark Brown
2013-09-01 20:50:17 +0800

30 Aug, 2013

1 commit

693207837 cpufreq: Don't use smp_processor_id() in preemptible context

Workqueues are preemptible even if works are queued on them with
queue_work_on(). Let's use raw_smp_processor_id() here to silence
the warning.

BUG: using smp_processor_id() in preemptible [00000000] code: kworker/3:2/674
caller is gov_queue_work+0x28/0xb0
CPU: 0 PID: 674 Comm: kworker/3:2 Tainted: G W 3.10.0 #30
Workqueue: events od_dbs_timer
(unwind_backtrace+0x0/0x11c) from (show_stack+0x10/0x14)
(show_stack+0x10/0x14) from (debug_smp_processor_id+0xbc/0xf0)
(debug_smp_processor_id+0xbc/0xf0) from (gov_queue_work+0x28/0xb0)
(gov_queue_work+0x28/0xb0) from (od_dbs_timer+0x108/0x134)
(od_dbs_timer+0x108/0x134) from (process_one_work+0x25c/0x444)
(process_one_work+0x25c/0x444) from (worker_thread+0x200/0x344)
(worker_thread+0x200/0x344) from (kthread+0xa0/0xb0)
(kthread+0xa0/0xb0) from (ret_from_fork+0x14/0x3c)

Signed-off-by: Stephen Boyd
Acked-by: Viresh Kumar
Signed-off-by: Rafael J. Wysocki

Stephen Boyd
2013-08-30 04:19:23 +0800

29 Aug, 2013

3 commits

c4afc4109 cpufreq: governor: Fix typos in comments

- 'Governer' should be 'Governor'.
- 'S' is used for Siemens (electrical conductance) in SI units,
so use small 's' for seconds.

Signed-off-by: Stratos Karafotis
Acked-by: Viresh Kumar
Signed-off-by: Rafael J. Wysocki

Stratos Karafotis
2013-08-29 04:04:54 +0800
934dac1ea cpufreq: governors: Remove duplicate check of target freq in supported range

Function __cpufreq_driver_target() checks if target_freq is within
policy->min and policy->max range. generic_powersave_bias_target() also
checks if target_freq is valid via a cpufreq_frequency_table_target()
call. So, drop the unnecessary duplicate check in *_check_cpu().

Signed-off-by: Stratos Karafotis
Acked-by: Viresh Kumar
Signed-off-by: Rafael J. Wysocki

Stratos Karafotis
2013-08-29 04:03:02 +0800
3617f2ca6 cpufreq: Fix timer/workqueue corruption due to double queueing

When a CPU is hot removed we'll cancel all the delayed work items
via gov_cancel_work(). Normally this will just cancel a delayed
timer on each CPU that the policy is managing and the work won't
run, but if the work is already running the workqueue code will
wait for the work to finish before continuing to prevent the
work items from re-queuing themselves like they normally do. This
scheme will work most of the time, except for the case where the
work function determines that it should adjust the delay for all
other CPUs that the policy is managing. If this scenario occurs,
the canceling CPU will cancel its own work but queue up the other
CPUs' works to run. For example:

CPU0                                    CPU1
----                                    ----
cpu_down()
 ...
 __cpufreq_remove_dev()
  cpufreq_governor_dbs()
   case CPUFREQ_GOV_STOP:
    gov_cancel_work(dbs_data, policy);
     cpu0 work is canceled
      timer is canceled
     cpu1 work is canceled
                                        od_dbs_timer()
                                         gov_queue_work(*, *, true);
                                          cpu0 work queued
                                          cpu1 work queued
                                          cpu2 work queued
                                          ...
     cpu1 work is canceled
     cpu2 work is canceled
     ...

At the end of the GOV_STOP case cpu0 still has a work queued to
run although the code is expecting all of the works to be
canceled. __cpufreq_remove_dev() will then proceed to
re-initialize all the other CPUs' works except for the CPU that is
going down. The CPUFREQ_GOV_START case in cpufreq_governor_dbs()
will trample over the queued work and debugobjects will spit out
a warning:

WARNING: at lib/debugobjects.c:260 debug_print_object+0x94/0xbc()
ODEBUG: init active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x10
Modules linked in:
CPU: 0 PID: 1491 Comm: sh Tainted: G W 3.10.0 #19
(unwind_backtrace+0x0/0x11c) from (show_stack+0x10/0x14)
(show_stack+0x10/0x14) from (warn_slowpath_common+0x4c/0x6c)
(warn_slowpath_common+0x4c/0x6c) from (warn_slowpath_fmt+0x2c/0x3c)
(warn_slowpath_fmt+0x2c/0x3c) from (debug_print_object+0x94/0xbc)
(debug_print_object+0x94/0xbc) from (__debug_object_init+0x2d0/0x340)
(__debug_object_init+0x2d0/0x340) from (init_timer_key+0x14/0xb0)
(init_timer_key+0x14/0xb0) from (cpufreq_governor_dbs+0x3e8/0x5f8)
(cpufreq_governor_dbs+0x3e8/0x5f8) from (__cpufreq_governor+0xdc/0x1a4)
(__cpufreq_governor+0xdc/0x1a4) from (__cpufreq_remove_dev.isra.10+0x3b4/0x434)
(__cpufreq_remove_dev.isra.10+0x3b4/0x434) from (cpufreq_cpu_callback+0x60/0x80)
(cpufreq_cpu_callback+0x60/0x80) from (notifier_call_chain+0x38/0x68)
(notifier_call_chain+0x38/0x68) from (__cpu_notify+0x28/0x40)
(__cpu_notify+0x28/0x40) from (_cpu_down+0x7c/0x2c0)
(_cpu_down+0x7c/0x2c0) from (cpu_down+0x24/0x40)
(cpu_down+0x24/0x40) from (store_online+0x2c/0x74)
(store_online+0x2c/0x74) from (dev_attr_store+0x18/0x24)
(dev_attr_store+0x18/0x24) from (sysfs_write_file+0x100/0x148)
(sysfs_write_file+0x100/0x148) from (vfs_write+0xcc/0x174)
(vfs_write+0xcc/0x174) from (SyS_write+0x38/0x64)
(SyS_write+0x38/0x64) from (ret_fast_syscall+0x0/0x30)

Signed-off-by: Stephen Boyd
Acked-by: Viresh Kumar
Signed-off-by: Rafael J. Wysocki

Stephen Boyd
2013-08-29 03:57:13 +0800

27 Aug, 2013

1 commit

f7b2ed43b Merge branch 'cpufreq-fixes' of git://git.linaro.org/people/vireshk/linux into pm-cpufreq

Pull cpufreq fixes for v3.12 from Viresh Kumar.

* 'cpufreq-fixes' of git://git.linaro.org/people/vireshk/linux:
cpufreq: imx6q: Fix clock enable balance
cpufreq: tegra: fix the wrong clock name

Rafael J. Wysocki
2013-08-27 08:37:54 +0800

26 Aug, 2013

1 commit

fae19b847 cpufreq: imx6q: Fix clock enable balance

For changing the cpu frequency the i.MX6q has to be switched to some
intermediate clock during the PLL reprogramming. The driver tries
to be clever to keep the enable count correct but gets it wrong. If
the cpufreq is increased it calls clk_disable_unprepare twice
on pll2_pfd2_396m. This puts all other devices which get their clock
from pll2_pfd2_396m into a nonworking state.

Fix this by removing the clk enabling/disabling altogether since the
clk core will do this automatically during a reparent.
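
The effect of an unbalanced disable on a shared clock can be sketched with a plain reference count. This is a user-space illustration under stated assumptions, not the clk framework: struct sketch_clk and the sketch_clk_* helpers are hypothetical and model only the enable count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical shared clock: running while enable_count > 0. */
struct sketch_clk { int enable_count; };

void sketch_clk_enable(struct sketch_clk *clk)
{
	assert(clk != NULL);
	clk->enable_count++;
}

void sketch_clk_disable(struct sketch_clk *clk)
{
	assert(clk != NULL);
	/* Ignore a disable on an already stopped clock. */
	if (clk->enable_count > 0)
		clk->enable_count--;
}

bool sketch_clk_is_running(const struct sketch_clk *clk)
{
	assert(clk != NULL);
	return clk->enable_count > 0;
}
```

If one consumer enables the clock once but disables it twice, the second disable consumes the reference held by another consumer, and the clock stops while that consumer still depends on it.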

Signed-off-by: Sascha Hauer
Signed-off-by: Viresh Kumar

Sascha Hauer
2013-08-26 22:04:07 +0800

24 Aug, 2013

1 commit

b192b910f cpufreq: tegra: fix the wrong clock name

The "cpu" and "pclk_p_cclk" clocks were virtual clock names used in
the legacy Tegra clock framework. They are no longer used after the
conversion to CCF. Switch to the correct clock names that are in use now.

Tested-by: Stephen Warren
Signed-off-by: Joseph Lo
Signed-off-by: Viresh Kumar

Joseph Lo
2013-08-24 00:28:28 +0800

23 Aug, 2013

1 commit

09198f8fe Merge branch 'cpu_of_node' of git://linux-arm.org/linux-skn into pm-cpufreq-next

Pull DT/core/cpufreq cpu_ofnode updates for v3.12 from Sudeep KarkadaNagesha.

* 'cpu_of_node' of git://linux-arm.org/linux-skn:
cpufreq: pmac32-cpufreq: remove device tree parsing for cpu nodes
cpufreq: pmac64-cpufreq: remove device tree parsing for cpu nodes
cpufreq: maple-cpufreq: remove device tree parsing for cpu nodes
cpufreq: arm_big_little: remove device tree parsing for cpu nodes
cpufreq: kirkwood-cpufreq: remove device tree parsing for cpu nodes
cpufreq: spear-cpufreq: remove device tree parsing for cpu nodes
cpufreq: highbank-cpufreq: remove device tree parsing for cpu nodes
cpufreq: cpufreq-cpu0: remove device tree parsing for cpu nodes
cpufreq: imx6q-cpufreq: remove device tree parsing for cpu nodes
drivers/bus: arm-cci: avoid parsing DT for cpu device nodes
ARM: mvebu: remove device tree parsing for cpu nodes
ARM: topology: remove hwid/MPIDR dependency from cpu_capacity
of/device: add helper to get cpu device node from logical cpu index
driver/core: cpu: initialize of_node in cpu's device struture
ARM: DT/kernel: define ARM specific arch_match_cpu_phys_id
of: move of_get_cpu_node implementation to DT core library
powerpc: refactor of_get_cpu_node to support other architectures
openrisc: remove undefined of_get_cpu_node declaration
microblaze: remove undefined of_get_cpu_node declaration

Rafael J. Wysocki
2013-08-23 06:57:19 +0800

21 Aug, 2013

10 commits

1037b2752 cpufreq: pmac32-cpufreq: remove device tree parsing for cpu nodes

Now that the cpu device registration initialises the of_node (if available)
appropriately for all the cpus, parsing here is redundant.

This patch removes DT parsing and uses cpu->of_node instead.

Cc: Benjamin Herrenschmidt
Acked-by: Viresh Kumar
Signed-off-by: Sudeep KarkadaNagesha

Sudeep KarkadaNagesha
2013-08-21 17:29:56 +0800
760287ab9 cpufreq: pmac64-cpufreq: remove device tree parsing for cpu nodes

Now that the cpu device registration initialises the of_node (if available)
appropriately for all the cpus, parsing here is redundant.

This patch removes all DT parsing and uses cpu->of_node instead.

Cc: Benjamin Herrenschmidt
Acked-by: Viresh Kumar
Signed-off-by: Sudeep KarkadaNagesha

Sudeep KarkadaNagesha
2013-08-21 17:29:56 +0800
2421d4c34 cpufreq: maple-cpufreq: remove device tree parsing for cpu nodes

Now that the cpu device registration initialises the of_node (if available)
appropriately for all the cpus, parsing here is redundant.

This patch removes all DT parsing and uses cpu->of_node instead.

Cc: Dmitry Eremin-Solenikov
Acked-by: Viresh Kumar
Signed-off-by: Sudeep KarkadaNagesha

Sudeep KarkadaNagesha
2013-08-21 17:29:55 +0800
da0eb143d cpufreq: arm_big_little: remove device tree parsing for cpu nodes

Now that the cpu device registration initialises the of_node (if available)
appropriately for all the cpus, parsing here is redundant.

This patch removes all DT parsing and uses cpu->of_node instead.

Acked-by: Viresh Kumar
Signed-off-by: Sudeep KarkadaNagesha

Sudeep KarkadaNagesha
2013-08-21 17:29:55 +0800
e768f350c cpufreq: kirkwood-cpufreq: remove device tree parsing for cpu nodes

Now that the cpu device registration initialises the of_node (if available)
appropriately for all the cpus, parsing here is redundant.

This patch removes all DT parsing and uses cpu->of_node instead.

Cc: Jason Cooper
Acked-by: Andrew Lunn
Acked-by: Viresh Kumar
Signed-off-by: Sudeep KarkadaNagesha

Sudeep KarkadaNagesha
2013-08-21 17:29:55 +0800
c0e469487 cpufreq: spear-cpufreq: remove device tree parsing for cpu nodes

Now that the cpu device registration initialises the of_node (if available)
appropriately for all the cpus, parsing here is redundant.

This patch removes all DT parsing and uses cpu->of_node instead.

Cc: Deepak Sikri
Acked-by: Viresh Kumar
Signed-off-by: Sudeep KarkadaNagesha

Sudeep KarkadaNagesha
2013-08-21 17:29:54 +0800
5de6e94a2 cpufreq: highbank-cpufreq: remove device tree parsing for cpu nodes

Now that the cpu device registration initialises the of_node (if available)
appropriately for all the cpus, parsing here is redundant.

This patch removes all DT parsing and uses cpu->of_node instead.

Cc: Mark Langsdorf
Acked-by: Rob Herring
Acked-by: Viresh Kumar
Signed-off-by: Sudeep KarkadaNagesha

Sudeep KarkadaNagesha
2013-08-21 17:29:54 +0800
f837a9b5a cpufreq: cpufreq-cpu0: remove device tree parsing for cpu nodes

Now that the cpu device registration initialises the of_node (if available)
appropriately for all the cpus, parsing here is redundant.

This patch removes all DT parsing and uses cpu->of_node instead.

Acked-by: Shawn Guo
Acked-by: Rob Herring
Acked-by: Viresh Kumar
Signed-off-by: Sudeep KarkadaNagesha

Sudeep KarkadaNagesha
2013-08-21 17:29:53 +0800
cdc58d602 cpufreq: imx6q-cpufreq: remove device tree parsing for cpu nodes

Now that the cpu device registration initialises the of_node (if available)
appropriately for all the cpus, parsing here is redundant.

This patch removes all DT parsing and uses cpu->of_node instead.

Acked-by: Shawn Guo
Acked-by: Viresh Kumar
Signed-off-by: Sudeep KarkadaNagesha

Sudeep KarkadaNagesha
2013-08-21 17:29:53 +0800
5025d628c cpufreq: fix bad unlock balance on !CONFIG_SMP

This patch tries to fix the lockdep complaint attached below.

It seems that we should always acquire cpufreq_rwsem for reading,
whether CONFIG_SMP is enabled or not. Also, CONFIG_HOTPLUG_CPU
depends on CONFIG_SMP, so it seems the code enabled by
CONFIG_HOTPLUG_CPU does not need its own CONFIG_SMP guard.

[ 0.504191] =====================================
[ 0.504627] [ BUG: bad unlock balance detected! ]
[ 0.504627] 3.11.0-rc6-next-20130819 #1 Not tainted
[ 0.504627] -------------------------------------
[ 0.504627] swapper/1 is trying to release lock (cpufreq_rwsem) at:
[ 0.504627] [] cpufreq_add_dev+0x13a/0x3e0
[ 0.504627] but there are no more locks to release!
[ 0.504627]
[ 0.504627] other info that might help us debug this:
[ 0.504627] 1 lock held by swapper/1:
[ 0.504627] #0: (subsys mutex#4){+.+.+.}, at: [] subsys_interface_register+0x4f/0xe0
[ 0.504627]
[ 0.504627] stack backtrace:
[ 0.504627] CPU: 0 PID: 1 Comm: swapper Not tainted 3.11.0-rc6-next-20130819 #1
[ 0.504627] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
[ 0.504627] ffffffff813d927a ffff88007f847c98 ffffffff814c062b ffff88007f847cc8
[ 0.504627] ffffffff81098bce ffff88007f847cf8 ffffffff81aadc30 ffffffff813d927a
[ 0.504627] 00000000ffffffff ffff88007f847d68 ffffffff8109d0be 0000000000000006
[ 0.504627] Call Trace:
[ 0.504627] [] ? cpufreq_add_dev+0x13a/0x3e0
[ 0.504627] [] dump_stack+0x19/0x1b
[ 0.504627] [] print_unlock_imbalance_bug+0xfe/0x110
[ 0.504627] [] ? cpufreq_add_dev+0x13a/0x3e0
[ 0.504627] [] lock_release_non_nested+0x1ee/0x310
[ 0.504627] [] ? mark_held_locks+0xae/0x120
[ 0.504627] [] ? kfree+0xcb/0x1d0
[ 0.504627] [] ? cpufreq_policy_free+0x4a/0x60
[ 0.504627] [] ? cpufreq_add_dev+0x13a/0x3e0
[ 0.504627] [] lock_release+0xc4/0x250
[ 0.504627] [] up_read+0x23/0x40
[ 0.504627] [] cpufreq_add_dev+0x13a/0x3e0
[ 0.504627] [] subsys_interface_register+0x99/0xe0
[ 0.504627] [] ? cpufreq_gov_dbs_init+0x12/0x12
[ 0.504627] [] cpufreq_register_driver+0x9d/0x1d0
[ 0.504627] [] ? cpufreq_gov_dbs_init+0x12/0x12
[ 0.504627] [] acpi_cpufreq_init+0xfe/0x1f8
[ 0.504627] [] do_one_initcall+0xda/0x180
[ 0.504627] [] kernel_init_freeable+0x12c/0x1bb
[ 0.504627] [] ? do_early_param+0x8c/0x8c
[ 0.504627] [] ? rest_init+0x140/0x140
[ 0.504627] [] kernel_init+0xe/0xf0
[ 0.504627] [] ret_from_fork+0x7a/0xb0
[ 0.504627] [] ? rest_init+0x140/0x140

Signed-off-by: Li Zhong
Acked-and-tested-by: Viresh Kumar
Signed-off-by: Rafael J. Wysocki

Li Zhong
2013-08-21 08:04:31 +0800
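
The imbalance reported by lockdep can be reproduced in miniature. The sketch below assumes, based on the commit message, that the read-acquire was only compiled in with CONFIG_SMP while the release ran unconditionally. `struct toy_rwsem` and every function here are hypothetical stand-ins, and the `smp` flag replaces the compile-time option so that both cases can run in one program.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an rw-semaphore: it only counts readers. */
struct toy_rwsem {
    int readers;
};

static void toy_down_read(struct toy_rwsem *sem)
{
    assert(sem != NULL);
    sem->readers++;
}

/* Returns false on an unbalanced release, roughly what lockdep flags. */
static bool toy_up_read(struct toy_rwsem *sem)
{
    assert(sem != NULL);
    if (sem->readers == 0)
        return false;
    sem->readers--;
    return true;
}

/* Buggy shape: acquire only when "smp" is set, release always. */
static bool add_dev_buggy(struct toy_rwsem *sem, bool smp)
{
    if (smp)
        toy_down_read(sem);
    /* ... work that expects the lock to be held ... */
    return toy_up_read(sem);
}

/* Fixed shape: acquire unconditionally, so the release always matches. */
static bool add_dev_fixed(struct toy_rwsem *sem, bool smp)
{
    (void)smp;                    /* the option no longer affects locking */
    toy_down_read(sem);
    /* ... work that expects the lock to be held ... */
    return toy_up_read(sem);
}
```

With `smp` false, `add_dev_buggy()` releases a lock it never took and returns false, while `add_dev_fixed()` stays balanced in both configurations.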

20 Aug, 2013

4 commits

1b2742944 cpufreq: Use cpufreq_policy_list for iterating over policies

To iterate over all policies, we currently iterate over all online
CPUs and then get the policy for each of them, which is suboptimal.
Use the newly created cpufreq_policy_list for this purpose instead.

Signed-off-by: Viresh Kumar
Signed-off-by: Rafael J. Wysocki

Viresh Kumar
2013-08-20 21:43:50 +0800
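
Why the per-CPU walk is suboptimal shows up in a small model where one policy covers several CPUs. This is a sketch under stated assumptions: `struct toy_policy`, `visit_via_cpus()` and `visit_via_list()` are hypothetical names, and a plain singly linked list stands in for cpufreq_policy_list.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 4

/* Hypothetical, simplified policy: one policy may cover several CPUs. */
struct toy_policy {
    int cpu;                      /* CPU that manages this policy */
    int visits;                   /* how often an iteration touched it */
    struct toy_policy *next;      /* link in the list of active policies */
};

/* Old pattern: walk every CPU and look up its policy. */
static int visit_via_cpus(struct toy_policy *cpu_data[NR_CPUS])
{
    int steps = 0;

    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        if (cpu_data[cpu] != NULL) {
            cpu_data[cpu]->visits++;
            steps++;
        }
    }
    return steps;
}

/* New pattern: walk the policy list, touching each policy exactly once. */
static int visit_via_list(struct toy_policy *head)
{
    int steps = 0;

    for (struct toy_policy *p = head; p != NULL; p = p->next) {
        p->visits++;
        steps++;
    }
    return steps;
}
```

With two policies that each cover two CPUs, the per-CPU walk takes four steps and reaches each policy twice, whereas the list walk takes two steps.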
474deff74 cpufreq: remove cpufreq_policy_cpu per-cpu variable

cpufreq_policy_cpu per-cpu variables are used for storing the ID of
the CPU that manages the given CPU's policy. However, we also store
a policy pointer for each cpu in cpufreq_cpu_data, so the
cpufreq_policy_cpu information is simply redundant.

It is better to use cpufreq_cpu_data to retrieve a policy and get
policy->cpu from there, so make that happen everywhere and drop the
cpufreq_policy_cpu per-cpu variables which aren't necessary any more.

[rjw: Changelog]
Signed-off-by: Viresh Kumar
Signed-off-by: Rafael J. Wysocki

Viresh Kumar
2013-08-20 21:43:50 +0800
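
The redundancy is easy to see in a toy model: if each CPU already has a pointer to its policy, the managing CPU can be read from the policy itself. The array `toy_cpu_data` and the helper `managing_cpu()` are hypothetical illustrations that stand in for the per-cpu cpufreq_cpu_data lookup.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 4

/* Hypothetical, simplified policy holding only the managing CPU. */
struct toy_policy {
    int cpu;
};

/* Per-CPU policy pointers, modelled as a plain array. */
static struct toy_policy *toy_cpu_data[NR_CPUS];

/* Managing CPU for `cpu`, or -1 if that CPU has no policy. */
static int managing_cpu(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    return toy_cpu_data[cpu] != NULL ? toy_cpu_data[cpu]->cpu : -1;
}
```

No second per-CPU table of CPU ids is needed, because the answer is one pointer dereference away.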
9e9fd8016 cpufreq: remove unnecessary check in __cpufreq_governor()

We don't need to check whether the event is CPUFREQ_GOV_POLICY_INIT and
put the governor module, as the event can only be START/STOP here.

Remove the useless check.

Signed-off-by: Viresh Kumar
Signed-off-by: Rafael J. Wysocki

Viresh Kumar
2013-08-20 21:43:50 +0800
9515f4d69 cpufreq: remove policy from cpufreq_policy_list during suspend

cpufreq_policy_list is a list of active policies. We do remove
policies from this list when all CPUs belonging to that policy are
removed. But during system suspend we don't really free a policy
struct as it will be used again during resume, so we didn't remove
it from cpufreq_policy_list either.

However, this is incorrect. We are saying this policy isn't valid
anymore and must not be referenced (though we haven't freed it), but
it can still be used by code that iterates over cpufreq_policy_list.

Remove policy from this list during system suspend as well.
Of course, we must add it back whenever the first CPU belonging to
that policy shows up.

[rjw: Changelog]
Signed-off-by: Viresh Kumar
Signed-off-by: Rafael J. Wysocki

Viresh Kumar
2013-08-20 21:43:50 +0800
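
The intended behaviour can be sketched with a minimal list: on suspend the policy memory is kept but unlinked, and it is linked again when its first CPU returns. All names below (`policy_list`, `toy_suspend()`, `toy_resume()` and the list helpers) are hypothetical and only model the idea; they are not the kernel's list API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified policy with an intrusive list link. */
struct toy_policy {
    int cpu;
    struct toy_policy *next;
};

/* Head of the list of active policies. */
static struct toy_policy *policy_list;

static void list_add_policy(struct toy_policy *p)
{
    assert(p != NULL);
    p->next = policy_list;
    policy_list = p;
}

static void list_del_policy(struct toy_policy *p)
{
    for (struct toy_policy **pp = &policy_list; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == p) {
            *pp = p->next;        /* unlink without freeing the policy */
            p->next = NULL;
            return;
        }
    }
}

static bool list_contains(const struct toy_policy *p)
{
    for (const struct toy_policy *it = policy_list; it != NULL; it = it->next)
        if (it == p)
            return true;
    return false;
}

/* Suspend: keep the struct for resume, but hide it from list walkers. */
static void toy_suspend(struct toy_policy *p)
{
    list_del_policy(p);
}

/* Resume: the first CPU of the policy is back, so link it again. */
static void toy_resume(struct toy_policy *p)
{
    if (!list_contains(p))
        list_add_policy(p);
}
```

Between `toy_suspend()` and `toy_resume()`, code that walks `policy_list` cannot reach the policy, even though its contents are preserved.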