12 Sep, 2013

5 commits

  • * pm-cpufreq:
    cpufreq: Acquire the lock in cpufreq_policy_restore() for reading
    cpufreq: Prevent problems in update_policy_cpu() if last_cpu == new_cpu
    cpufreq: Restructure if/else block to avoid unintended behavior
    cpufreq: Fix crash in cpufreq-stats during suspend/resume

    Rafael J. Wysocki
     
  • In cpufreq_policy_restore(), the policy from before system suspend is
    read from the per-CPU cpufreq_cpu_data_fallback variable. That is a read
    operation rather than a write, so take the lock for reading there.

    Signed-off-by: Lan Tianyu
    Reviewed-by: Srivatsa S. Bhat
    Acked-by: Viresh Kumar
    Signed-off-by: Rafael J. Wysocki

    Lan Tianyu
     
  • If update_policy_cpu() is invoked with the existing policy->cpu itself
    as the new-cpu parameter, then a lot of things can go terribly wrong.

    In its present form, update_policy_cpu() always assumes that the new-cpu
    is different from policy->cpu and invokes other functions to perform their
    respective updates. And those functions implement the actual update like
    this:

    per_cpu(..., new_cpu) = per_cpu(..., last_cpu);
    per_cpu(..., last_cpu) = NULL;

    Thus, when new_cpu == last_cpu, the final NULL assignment makes the per-cpu
    references vanish into thin air! (memory leak). From there, it leads to more
    problems: cpufreq_stats_create_table() now doesn't find the per-cpu reference
    and hence tries to create a new sysfs-group; but sysfs already had created
    the group earlier, so it complains that it cannot create a duplicate filename.
    In short, the repercussions of a rather innocuous invocation of
    update_policy_cpu() can turn out to be pretty nasty.

    Ideally update_policy_cpu() should handle this situation (new == last)
    gracefully, and not lead to such severe problems. So fix it by adding an
    appropriate check.

    Signed-off-by: Srivatsa S. Bhat
    Tested-by: Stephen Warren
    Signed-off-by: Rafael J. Wysocki

    Srivatsa S. Bhat
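
The failure mode and the guard described in this commit can be modelled in a few lines of userspace C. This is only a sketch: `stats_ref`, `move_ref_unguarded()` and `move_ref_guarded()` are hypothetical names, and a plain array stands in for the kernel's per-CPU variables.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 4

/* Hypothetical stand-in for a per-CPU reference (e.g. a stats table). */
static int *stats_ref[NR_CPUS];

/* Mirrors the two-assignment update pattern quoted in the commit message. */
static void move_ref_unguarded(unsigned int last_cpu, unsigned int new_cpu)
{
	stats_ref[new_cpu] = stats_ref[last_cpu];
	/* When new_cpu == last_cpu this wipes the only reference. */
	stats_ref[last_cpu] = NULL;
}

/* Same update, but a no-op when there is nothing to move. */
static void move_ref_guarded(unsigned int last_cpu, unsigned int new_cpu)
{
	if (new_cpu == last_cpu)
		return;

	move_ref_unguarded(last_cpu, new_cpu);
}
```

Calling `move_ref_unguarded(1, 1)` leaves `stats_ref[1]` set to NULL even though nothing was moved, which is the lost reference the message describes; `move_ref_guarded(1, 1)` leaves it intact.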
     
  • In __cpufreq_remove_dev_prepare(), the code which decides whether to remove
    the sysfs link or nominate a new policy cpu, is governed by an if/else block
    with a rather complex set of conditionals. Worse, they harbor a subtlety
    which leads to certain unintended behavior.

    The code looks like this:

    if (cpu != policy->cpu && !frozen) {
            sysfs_remove_link(&dev->kobj, "cpufreq");
    } else if (cpus > 1) {
            new_cpu = cpufreq_nominate_new_policy_cpu(...);
            ...
            update_policy_cpu(..., new_cpu);
    }

    The original intention was:
    If the CPU going offline is not policy->cpu, just remove the link.
    On the other hand, if the CPU going offline is the policy->cpu itself,
    hand over the policy->cpu job to some other surviving CPU in that policy.

    But because the 'if' condition also includes the 'frozen' check, now there
    are *two* possibilities by which we can enter the 'else' block:

    1. cpu == policy->cpu (intended)
    2. cpu != policy->cpu && frozen (unintended)

    Due to the second (unintended) scenario, we end up spuriously nominating
    a CPU as the policy->cpu, even when the existing policy->cpu is alive and
    well. This can cause problems further down the line, especially when we end
    up nominating the same policy->cpu as the new one (ie., old == new),
    because it totally confuses update_policy_cpu().

    To avoid this mess, restructure the if/else block to only do what was
    originally intended, and thus prevent any unwelcome surprises.

    Signed-off-by: Srivatsa S. Bhat
    Tested-by: Stephen Warren
    Signed-off-by: Rafael J. Wysocki

    Srivatsa S. Bhat
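
The two branch structures can be compared with a small decision-function model. This is a sketch under assumptions: `decide_old()` follows the if/else block quoted in the message, while `decide_new()` is one possible restructuring that matches the stated intention, not necessarily the exact code that was merged. The `enum action` values are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for what the offline path ends up doing. */
enum action { ACT_NONE, ACT_REMOVE_LINK, ACT_NOMINATE };

/* Branch structure as quoted in the commit message. */
static enum action decide_old(unsigned int cpu, unsigned int policy_cpu,
			      unsigned int cpus, bool frozen)
{
	if (cpu != policy_cpu && !frozen)
		return ACT_REMOVE_LINK;
	else if (cpus > 1)
		return ACT_NOMINATE;	/* also reached when cpu != policy_cpu && frozen */

	return ACT_NONE;
}

/* One restructuring that only nominates when the policy CPU itself goes away. */
static enum action decide_new(unsigned int cpu, unsigned int policy_cpu,
			      unsigned int cpus, bool frozen)
{
	if (cpu != policy_cpu) {
		if (!frozen)
			return ACT_REMOVE_LINK;
		return ACT_NONE;
	} else if (cpus > 1) {
		return ACT_NOMINATE;
	}

	return ACT_NONE;
}
```

With `cpu = 1`, `policy_cpu = 0`, `cpus = 2` and `frozen = true`, `decide_old()` nominates a new policy CPU although CPU 0 is still alive, whereas `decide_new()` does nothing.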
     
  • Stephen Warren reported that the cpufreq-stats code hits a NULL pointer
    dereference during the second attempt to suspend a system. He also
    pin-pointed the problem to commit 5302c3f "cpufreq: Perform light-weight
    init/teardown during suspend/resume".

    That commit actually ensured that the cpufreq-stats table and the
    cpufreq-stats sysfs entries are *not* torn down (ie., not freed) during
    suspend/resume, which makes it all the more surprising. However, it turns
    out that the root-cause is not that we access an already freed memory, but
    that the reference to the allocated memory gets moved around and we lose
    track of that during resume, leading to the reported crash in a subsequent
    suspend attempt.

    In the suspend path, during CPU offline, the value of policy->cpu is updated
    by choosing one of the surviving CPUs in that policy, as long as there is
    at least one CPU in that policy. And cpufreq_stats_update_policy_cpu() is
    invoked to update the reference to the stats structure by assigning it to
    the new CPU. However, in the resume path, during CPU online, we end up
    assigning a fresh CPU as the policy->cpu, without letting cpufreq-stats
    know about this. Thus the reference to the stats structure remains
    (incorrectly) associated with the old CPU. So, in a subsequent suspend attempt,
    during CPU offline, we end up accessing an incorrect location to get the
    stats structure, which eventually leads to the NULL pointer dereference.

    Fix this by letting cpufreq-stats know about the update of the policy->cpu
    during CPU online in the resume path. (Also, move the update_policy_cpu()
    function higher up in the file, so that __cpufreq_add_dev() can invoke
    it).

    Reported-and-tested-by: Stephen Warren
    Signed-off-by: Srivatsa S. Bhat
    Signed-off-by: Rafael J. Wysocki

    Srivatsa S. Bhat
     

11 Sep, 2013

2 commits

  • * pm-cpufreq:
    intel_pstate: Add Haswell CPU models
    Revert "cpufreq: make sure frequency transitions are serialized"
    cpufreq: Use signed type for 'ret' variable, to store negative error values
    cpufreq: Remove temporary fix for race between CPU hotplug and sysfs-writes
    cpufreq: Synchronize the cpufreq store_*() routines with CPU hotplug
    cpufreq: Invoke __cpufreq_remove_dev_finish() after releasing cpu_hotplug.lock
    cpufreq: Split __cpufreq_remove_dev() into two parts
    cpufreq: Fix wrong time unit conversion
    cpufreq: serialize calls to __cpufreq_governor()
    cpufreq: don't allow governor limits to be changed when it is disabled

    Rafael J. Wysocki
     
  • Enable the intel_pstate driver for Haswell CPUs. One missing Ivy Bridge
    model (0x3E) is also included. Models referenced from
    tools/power/x86/turbostat/turbostat.c:has_nehalem_turbo_ratio_limit

    Signed-off-by: Nell Hardcastle
    Acked-by: Viresh Kumar
    Acked-by: Dirk Brandewie
    Signed-off-by: Rafael J. Wysocki

    Nell Hardcastle
     

10 Sep, 2013

9 commits

  • Commit 7c30ed5 (cpufreq: make sure frequency transitions are
    serialized) attempted to serialize frequency transitions by
    adding checks to the CPUFREQ_PRECHANGE and CPUFREQ_POSTCHANGE
    notifications. However, it assumed that the notifications will
    always originate from the driver's .target() callback, but they
    also can be triggered by cpufreq_out_of_sync() and that leads to
    warnings like this on some systems:

    WARNING: CPU: 0 PID: 14543 at drivers/cpufreq/cpufreq.c:317
    __cpufreq_notify_transition+0x238/0x260()
    In middle of another frequency transition

    accompanied by a call trace similar to this one:

    [] dump_stack+0x46/0x58
    [] warn_slowpath_common+0x8c/0xc0
    [] ? acpi_cpufreq_target+0x320/0x320
    [] warn_slowpath_fmt+0x46/0x50
    [] __cpufreq_notify_transition+0x238/0x260
    [] cpufreq_notify_transition+0x3e/0x70
    [] cpufreq_out_of_sync+0x6d/0xb0
    [] cpufreq_update_policy+0x10c/0x160
    [] ? cpufreq_update_policy+0x160/0x160
    [] cpufreq_set_cur_state+0x8c/0xb5
    [] processor_set_cur_state+0xa3/0xcf
    [] thermal_cdev_update+0x9c/0xb0
    [] step_wise_throttle+0x5a/0x90
    [] handle_thermal_trip+0x4f/0x140
    [] thermal_zone_device_update+0x57/0xa0
    [] acpi_thermal_check+0x2e/0x30
    [] acpi_thermal_notify+0x40/0xdc
    [] acpi_device_notify+0x19/0x1b
    [] acpi_ev_notify_dispatch+0x41/0x5c
    [] acpi_os_execute_deferred+0x25/0x32
    [] process_one_work+0x170/0x4a0
    [] worker_thread+0x121/0x390
    [] ? manage_workers.isra.20+0x170/0x170
    [] kthread+0xc0/0xd0
    [] ? flush_kthread_worker+0xb0/0xb0
    [] ret_from_fork+0x7c/0xb0
    [] ? flush_kthread_worker+0xb0/0xb0

    For this reason, revert commit 7c30ed5 along with the fix 266c13d
    (cpufreq: Fix serialization of frequency transitions) on top of it
    and we will revisit the serialization problem later.

    Reported-by: Alessandro Bono
    Signed-off-by: Rafael J. Wysocki

    Rafael J. Wysocki
     
  • There are places where the variable 'ret' is declared as unsigned int
    and then used to store negative return values such as -EINVAL. Fix them
    by declaring the variable as a signed quantity.

    Signed-off-by: Srivatsa S. Bhat
    Signed-off-by: Rafael J. Wysocki

    Srivatsa S. Bhat
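
A short userspace illustration of why the type matters. The helper names are hypothetical, and the sketch assumes the usual case where `unsigned int` is narrower than `long long`: a negative error code stored in an unsigned variable turns into a large positive number as soon as it is widened, so a caller checking for a negative return never sees the error.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical operation that always fails. */
static int failing_op(void)
{
	return -EINVAL;
}

/* Buggy pattern: the error code is stored in an unsigned variable. */
static long long propagate_unsigned(void)
{
	unsigned int ret = failing_op();

	return ret;	/* zero-extended: becomes a large positive value */
}

/* Fixed pattern: a signed variable keeps the sign through widening. */
static long long propagate_signed(void)
{
	int ret = failing_op();

	return ret;	/* sign-extended: still -EINVAL */
}
```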
     
  • Commit "cpufreq: serialize calls to __cpufreq_governor()" had been a temporary
    and partial solution to the race condition between writing to a cpufreq sysfs
    file and taking a CPU offline. Now that we have a proper and complete solution
    to that problem, remove the temporary fix.

    Signed-off-by: Srivatsa S. Bhat
    Signed-off-by: Rafael J. Wysocki

    Srivatsa S. Bhat
     
  • The functions that are used to write to cpufreq sysfs files (such as
    store_scaling_max_freq()) are not hotplug safe. They can race with CPU
    hotplug tasks and lead to problems such as trying to acquire an already
    destroyed timer-mutex etc.

    Eg:

    __cpufreq_remove_dev()
     __cpufreq_governor(policy, CPUFREQ_GOV_STOP);
      policy->governor->governor(policy, CPUFREQ_GOV_STOP);
       cpufreq_governor_dbs()
        case CPUFREQ_GOV_STOP:
         mutex_destroy(&cpu_cdbs->timer_mutex)
         cpu_cdbs->cur_policy = NULL;

    store()
     __cpufreq_set_policy()
      __cpufreq_governor(policy, CPUFREQ_GOV_LIMITS);
       policy->governor->governor(policy, CPUFREQ_GOV_LIMITS);
        case CPUFREQ_GOV_LIMITS:
         mutex_lock(&cpu_cdbs->timer_mutex);             <-- destroyed mutex
         if (policy->max < cpu_cdbs->cur_policy->cur)    <-- cur_policy == NULL

    So synchronize the store_*() functions with CPU hotplug, using
    get_online_cpus()/put_online_cpus().

    Signed-off-by: Srivatsa S. Bhat
    Signed-off-by: Rafael J. Wysocki

    Srivatsa S. Bhat
     
  • __cpufreq_remove_dev_finish() handles the kobject cleanup for a CPU going
    offline. But because we destroy the kobject towards the end of the CPU offline
    phase, there are certain race windows where a task can try to write to a
    cpufreq sysfs file (eg: using store_scaling_max_freq()) while we are taking
    that CPU offline, and this can bump up the kobject refcount, which in turn might
    hinder the CPU offline task from running to completion. (It can also cause
    other more serious problems such as trying to acquire a destroyed timer-mutex
    etc., depending on the exact stage of the cleanup at which the task managed to
    take a new refcount).

    To fix the race window, we will need to synchronize those store_*() call-sites
    with CPU hotplug, using get_online_cpus()/put_online_cpus(). However, that
    in turn can cause a total deadlock because it can end up waiting for the
    CPU offline task to complete, with incremented refcount!

    Write to sysfs                 CPU offline task
    --------------                 ----------------
    kobj_refcnt++

                                   Acquire cpu_hotplug.lock

    get_online_cpus();

                                   Wait for kobj_refcnt to drop to zero

                       **DEADLOCK**

    A simple way to avoid this problem is to perform the kobject cleanup in the
    CPU offline path, with the cpu_hotplug.lock *released*. That is, we can
    perform the wait-for-kobj-refcnt-to-drop as well as the subsequent cleanup
    in the CPU_POST_DEAD stage of CPU offline, which is run with cpu_hotplug.lock
    released. Doing this helps us avoid deadlocks due to holding kobject refcounts
    and waiting on each other on the cpu_hotplug.lock.

    (Note: We can't move all of the cpufreq CPU offline steps to the
    CPU_POST_DEAD stage, because certain things such as stopping the governors
    have to be done before the outgoing CPU is marked offline. So retain those
    parts in the CPU_DOWN_PREPARE stage itself).

    Reported-by: Stephen Boyd
    Signed-off-by: Srivatsa S. Bhat
    Signed-off-by: Rafael J. Wysocki

    Srivatsa S. Bhat
     
  • During CPU offline, the cpufreq core invokes __cpufreq_remove_dev()
    to perform work such as stopping the cpufreq governor, clearing the
    CPU from the policy structure etc, and finally cleaning up the
    kobject.

    There are certain subtle issues related to the kobject cleanup, and
    it would be much easier to deal with them if we separate that part
    from the rest of the cleanup-work in the CPU offline phase. So split
    the __cpufreq_remove_dev() function into 2 parts: one that handles
    the kobject cleanup, and the other that handles the rest of the work.

    Reported-by: Stephen Boyd
    Signed-off-by: Srivatsa S. Bhat
    Signed-off-by: Rafael J. Wysocki

    Srivatsa S. Bhat
     
  • The time spent by a CPU under a given frequency is stored in jiffies unit
    in the cpu var cpufreq_stats_table->time_in_state[i], i being the index of
    the frequency.

    This is what is displayed in the following file on the right column:

    cat /sys/devices/system/cpu/cpuX/cpufreq/stats/time_in_state
    2301000 19835820
    2300000 3172
    [...]

    Now cpufreq converts this jiffies unit delta to clock_t before returning it
    to the user as in the above file. And that conversion is achieved using the API
    cputime64_to_clock_t().

    Although it accidentally works on traditional tick based cputime accounting, where
    cputime_t maps directly to jiffies, it doesn't work with other types of cputime
    accounting such as CONFIG_VIRT_CPU_ACCOUNTING_* where cputime_t can map to nsecs
    or any granularity preferred by the architecture.

    For example we get a buggy zero delta on full dyntick configurations:

    cat /sys/devices/system/cpu/cpuX/cpufreq/stats/time_in_state
    2301000 0
    2300000 0
    [...]

    Fix this by using the proper conversion from 64-bit jiffies to clock_t.

    Reported-and-tested-by: Carsten Emde
    Signed-off-by: Andreas Schwab
    Signed-off-by: Frederic Weisbecker
    Acked-by: Paul E. McKenney
    Signed-off-by: Rafael J. Wysocki

    Andreas Schwab
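
The effect of applying the wrong unit conversion can be shown with plain integer arithmetic. This is a sketch under stated assumptions, not the kernel's conversion code: it assumes HZ = 1000, USER_HZ = 100 and a nanosecond-granular cputime, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed configuration values for this sketch. */
#define HZ		1000ULL
#define USER_HZ		100ULL
#define NSEC_PER_SEC	1000000000ULL

/* Correct: the stored value is in jiffies, so scale by USER_HZ / HZ. */
static uint64_t jiffies_to_user_ticks(uint64_t jiffies)
{
	return jiffies * USER_HZ / HZ;
}

/* Wrong for jiffies input: treats the value as nanoseconds of cputime. */
static uint64_t nsecs_to_user_ticks(uint64_t nsecs)
{
	return nsecs * USER_HZ / NSEC_PER_SEC;
}
```

For a hypothetical delta of 3172 jiffies, the first helper yields 317 ticks, while the second one rounds down to 0, which matches the all-zero column reported on full dyntick configurations.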
     
  • We can't take a big lock around __cpufreq_governor() as this causes
    recursive locking for some cases. But calls to this routine must be
    serialized for every policy. Otherwise we can see some unpredictable
    events.

    For example, consider the following scenario:

    __cpufreq_remove_dev()
     __cpufreq_governor(policy, CPUFREQ_GOV_STOP);
      policy->governor->governor(policy, CPUFREQ_GOV_STOP);
       cpufreq_governor_dbs()
        case CPUFREQ_GOV_STOP:
         mutex_destroy(&cpu_cdbs->timer_mutex)
         cpu_cdbs->cur_policy = NULL;

    store()
     __cpufreq_set_policy()
      __cpufreq_governor(policy, CPUFREQ_GOV_LIMITS);
       policy->governor->governor(policy, CPUFREQ_GOV_LIMITS);
        case CPUFREQ_GOV_LIMITS:
         mutex_lock(&cpu_cdbs->timer_mutex);             <-- destroyed mutex
         if (policy->max < cpu_cdbs->cur_policy->cur)    <-- cur_policy == NULL

    Signed-off-by: Viresh Kumar
    Signed-off-by: Rafael J. Wysocki

    Viresh Kumar
     
  • __cpufreq_governor() returns -EBUSY when the governor is already
    stopped and we try to stop it again. However, while the governor is
    stopped, the CPUFREQ_GOV_LIMITS event must not be allowed either.

    This patch adds this check in __cpufreq_governor().

    Reported-by: Stephen Boyd
    Signed-off-by: Viresh Kumar
    Signed-off-by: Rafael J. Wysocki

    Viresh Kumar
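
The check can be pictured as a tiny state machine. This is a single-threaded sketch with hypothetical names (`struct gov_model`, `governor_event()`); the real function also needs locking, which is omitted here, and rejecting a START while already started is an assumption of the sketch rather than something stated in the message.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

enum gov_event { GOV_START, GOV_STOP, GOV_LIMITS };

/* Hypothetical model of the per-policy governor state. */
struct gov_model {
	bool governor_enabled;
	int limits_applied;	/* counts LIMITS events that were let through */
};

static int governor_event(struct gov_model *p, enum gov_event ev)
{
	switch (ev) {
	case GOV_START:
		if (p->governor_enabled)
			return -EBUSY;	/* assumption: double START is rejected */
		p->governor_enabled = true;
		return 0;
	case GOV_STOP:
		if (!p->governor_enabled)
			return -EBUSY;	/* already stopped */
		p->governor_enabled = false;
		return 0;
	case GOV_LIMITS:
		if (!p->governor_enabled)
			return -EBUSY;	/* the added check: no LIMITS while stopped */
		p->limits_applied++;
		return 0;
	}

	return -EINVAL;
}
```

After a STOP, both a second STOP and a LIMITS event return -EBUSY, so the governor callback is never entered with its state already torn down.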
     

04 Sep, 2013

1 commit

  • Pull ACPI and power management updates from Rafael Wysocki:

    1) ACPI-based PCI hotplug (ACPIPHP) subsystem rework and introduction
    of Intel Thunderbolt support on systems that use ACPI for signalling
    Thunderbolt hotplug events. This also should make ACPIPHP work in
    some cases in which it was known to have problems. From
    Rafael J Wysocki, Mika Westerberg and Kirill A Shutemov.

    2) ACPI core code cleanups and dock station support cleanups from
    Jiang Liu and Rafael J Wysocki.

    3) Fixes for locking problems related to ACPI device hotplug from
    Rafael J Wysocki.

    4) ACPICA update to version 20130725 including fixes, cleanups, support
    for more than 256 GPEs per GPE block and a change to make the ACPI
    PM Timer optional (we've seen systems without the PM Timer in the
    field already). One of the fixes, related to the DeRefOf operator,
    is necessary to prevent some Windows 8 oriented AML from causing
    problems to happen. From Bob Moore, Lv Zheng, and Jung-uk Kim.

    5) Removal of the old and long deprecated /proc/acpi/event interface
    and related driver changes from Thomas Renninger.

    6) ACPI and Xen changes to make the reduced hardware sleep work with
    the latter from Ben Guthro.

    7) ACPI video driver cleanups and a blacklist of systems that should
    not tell the BIOS that they are compatible with Windows 8 (or ACPI
    backlight and possibly other things will not work on them). From
    Felipe Contreras.

    8) Assorted ACPI fixes and cleanups from Aaron Lu, Hanjun Guo,
    Kuppuswamy Sathyanarayanan, Lan Tianyu, Sachin Kamat, Tang Chen,
    Toshi Kani, and Wei Yongjun.

    9) cpufreq ondemand governor target frequency selection change to
    reduce oscillations between min and max frequencies (essentially,
    it causes the governor to choose target frequencies proportional
    to load) from Stratos Karafotis.

    10) cpufreq fixes allowing sysfs attributes file permissions to be
    preserved over suspend/resume cycles from Srivatsa S Bhat.

    11) Removal of Device Tree parsing for CPU device nodes from multiple
    cpufreq drivers that required some changes related to
    of_get_cpu_node() to be made in a few architectures and in the
    driver core. From Sudeep KarkadaNagesha.

    12) cpufreq core fixes and cleanups related to mutual exclusion and
    driver module references from Viresh Kumar, Lukasz Majewski and
    Rafael J Wysocki.

    13) Assorted cpufreq fixes and cleanups from Amit Daniel Kachhap,
    Bartlomiej Zolnierkiewicz, Hanjun Guo, Jingoo Han, Joseph Lo,
    Julia Lawall, Li Zhong, Mark Brown, Sascha Hauer, Stephen Boyd,
    Stratos Karafotis, and Viresh Kumar.

    14) Fixes to prevent race conditions in coupled cpuidle, from
    Colin Cross.

    15) cpuidle core fixes and cleanups from Daniel Lezcano and
    Tuukka Tikkanen.

    16) Assorted cpuidle fixes and cleanups from Daniel Lezcano,
    Geert Uytterhoeven, Jingoo Han, Julia Lawall, Linus Walleij,
    and Sahara.

    17) System sleep tracing changes from Todd E Brandt and Shuah Khan.

    18) PNP subsystem conversion to using struct dev_pm_ops for power
    management from Shuah Khan.

    * tag 'pm+acpi-3.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (217 commits)
    cpufreq: Don't use smp_processor_id() in preemptible context
    cpuidle: coupled: fix race condition between pokes and safe state
    cpuidle: coupled: abort idle if pokes are pending
    cpuidle: coupled: disable interrupts after entering safe state
    ACPI / hotplug: Remove containers synchronously
    driver core / ACPI: Avoid device hot remove locking issues
    cpufreq: governor: Fix typos in comments
    cpufreq: governors: Remove duplicate check of target freq in supported range
    cpufreq: Fix timer/workqueue corruption due to double queueing
    ACPI / EC: Add ASUSTEK L4R to quirk list in order to validate ECDT
    ACPI / thermal: Add check of "_TZD" availability and evaluating result
    cpufreq: imx6q: Fix clock enable balance
    ACPI: blacklist win8 OSI for buggy laptops
    cpufreq: tegra: fix the wrong clock name
    cpuidle: Change struct menu_device field types
    cpuidle: Add a comment warning about possible overflow
    cpuidle: Fix variable domains in get_typical_interval()
    cpuidle: Fix menu_device->intervals type
    cpuidle: CodingStyle: Break up multiple assignments on single line
    cpuidle: Check called function parameter in get_typical_interval()
    ...

    Linus Torvalds
     

01 Sep, 2013

1 commit


30 Aug, 2013

1 commit

  • Workqueues are preemptible even if works are queued on them with
    queue_work_on(). Let's use raw_smp_processor_id() here to silence
    the warning.

    BUG: using smp_processor_id() in preemptible [00000000] code: kworker/3:2/674
    caller is gov_queue_work+0x28/0xb0
    CPU: 0 PID: 674 Comm: kworker/3:2 Tainted: G W 3.10.0 #30
    Workqueue: events od_dbs_timer
    [] (unwind_backtrace+0x0/0x11c) from [] (show_stack+0x10/0x14)
    [] (show_stack+0x10/0x14) from [] (debug_smp_processor_id+0xbc/0xf0)
    [] (debug_smp_processor_id+0xbc/0xf0) from [] (gov_queue_work+0x28/0xb0)
    [] (gov_queue_work+0x28/0xb0) from [] (od_dbs_timer+0x108/0x134)
    [] (od_dbs_timer+0x108/0x134) from [] (process_one_work+0x25c/0x444)
    [] (process_one_work+0x25c/0x444) from [] (worker_thread+0x200/0x344)
    [] (worker_thread+0x200/0x344) from [] (kthread+0xa0/0xb0)
    [] (kthread+0xa0/0xb0) from [] (ret_from_fork+0x14/0x3c)

    Signed-off-by: Stephen Boyd
    Acked-by: Viresh Kumar
    Signed-off-by: Rafael J. Wysocki

    Stephen Boyd
     

29 Aug, 2013

3 commits

  • - 'Governer' should be 'Governor'.
    - 'S' is used for Siemens (electrical conductance) in SI units,
    so use small 's' for seconds.

    Signed-off-by: Stratos Karafotis
    Acked-by: Viresh Kumar
    Signed-off-by: Rafael J. Wysocki

    Stratos Karafotis
     
  • Function __cpufreq_driver_target() checks if target_freq is within
    policy->min and policy->max range. generic_powersave_bias_target() also
    checks if target_freq is valid via a cpufreq_frequency_table_target()
    call. So, drop the unnecessary duplicate check in *_check_cpu().

    Signed-off-by: Stratos Karafotis
    Acked-by: Viresh Kumar
    Signed-off-by: Rafael J. Wysocki

    Stratos Karafotis
     
  • When a CPU is hot removed we'll cancel all the delayed work items
    via gov_cancel_work(). Normally this will just cancel a delayed
    timer on each CPU that the policy is managing and the work won't
    run, but if the work is already running the workqueue code will
    wait for the work to finish before continuing to prevent the
    work items from re-queuing themselves like they normally do. This
    scheme will work most of the time, except for the case where the
    work function determines that it should adjust the delay for all
    other CPUs that the policy is managing. If this scenario occurs,
    the canceling CPU will cancel its own work but queue up the other
    CPUs works to run. For example:

    CPU0                                  CPU1
    ----                                  ----
    cpu_down()
     ...
     __cpufreq_remove_dev()
      cpufreq_governor_dbs()
       case CPUFREQ_GOV_STOP:
        gov_cancel_work(dbs_data, policy);
         cpu0 work is canceled
          timer is canceled
          cpu1 work is canceled
                                          od_dbs_timer()
                                           gov_queue_work(*, *, true);
                                            cpu0 work queued
                                            cpu1 work queued
                                            cpu2 work queued
                                            ...
          cpu1 work is canceled
          cpu2 work is canceled
          ...

    At the end of the GOV_STOP case cpu0 still has a work queued to
    run although the code is expecting all of the works to be
    canceled. __cpufreq_remove_dev() will then proceed to
    re-initialize all the other CPUs works except for the CPU that is
    going down. The CPUFREQ_GOV_START case in cpufreq_governor_dbs()
    will trample over the queued work and debugobjects will spit out
    a warning:

    WARNING: at lib/debugobjects.c:260 debug_print_object+0x94/0xbc()
    ODEBUG: init active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x10
    Modules linked in:
    CPU: 0 PID: 1491 Comm: sh Tainted: G W 3.10.0 #19
    [] (unwind_backtrace+0x0/0x11c) from [] (show_stack+0x10/0x14)
    [] (show_stack+0x10/0x14) from [] (warn_slowpath_common+0x4c/0x6c)
    [] (warn_slowpath_common+0x4c/0x6c) from [] (warn_slowpath_fmt+0x2c/0x3c)
    [] (warn_slowpath_fmt+0x2c/0x3c) from [] (debug_print_object+0x94/0xbc)
    [] (debug_print_object+0x94/0xbc) from [] (__debug_object_init+0x2d0/0x340)
    [] (__debug_object_init+0x2d0/0x340) from [] (init_timer_key+0x14/0xb0)
    [] (init_timer_key+0x14/0xb0) from [] (cpufreq_governor_dbs+0x3e8/0x5f8)
    [] (cpufreq_governor_dbs+0x3e8/0x5f8) from [] (__cpufreq_governor+0xdc/0x1a4)
    [] (__cpufreq_governor+0xdc/0x1a4) from [] (__cpufreq_remove_dev.isra.10+0x3b4/0x434)
    [] (__cpufreq_remove_dev.isra.10+0x3b4/0x434) from [] (cpufreq_cpu_callback+0x60/0x80)
    [] (cpufreq_cpu_callback+0x60/0x80) from [] (notifier_call_chain+0x38/0x68)
    [] (notifier_call_chain+0x38/0x68) from [] (__cpu_notify+0x28/0x40)
    [] (__cpu_notify+0x28/0x40) from [] (_cpu_down+0x7c/0x2c0)
    [] (_cpu_down+0x7c/0x2c0) from [] (cpu_down+0x24/0x40)
    [] (cpu_down+0x24/0x40) from [] (store_online+0x2c/0x74)
    [] (store_online+0x2c/0x74) from [] (dev_attr_store+0x18/0x24)
    [] (dev_attr_store+0x18/0x24) from [] (sysfs_write_file+0x100/0x148)
    [] (sysfs_write_file+0x100/0x148) from [] (vfs_write+0xcc/0x174)
    [] (vfs_write+0xcc/0x174) from [] (SyS_write+0x38/0x64)
    [] (SyS_write+0x38/0x64) from [] (ret_fast_syscall+0x0/0x30)

    Signed-off-by: Stephen Boyd
    Acked-by: Viresh Kumar
    Signed-off-by: Rafael J. Wysocki

    Stephen Boyd
     

27 Aug, 2013

1 commit


26 Aug, 2013

1 commit

  • For changing the cpu frequency the i.MX6q has to be switched to some
    intermediate clock during the PLL reprogramming. The driver tries
    to be clever to keep the enable count correct but gets it wrong. If
    the cpufreq is increased it calls clk_disable_unprepare twice
    on pll2_pfd2_396m. This puts all other devices which get their clock
    from pll2_pfd2_396m into a nonworking state.

    Fix this by removing the clk enabling/disabling altogether since the
    clk core will do this automatically during a reparent.

    Signed-off-by: Sascha Hauer
    Signed-off-by: Viresh Kumar

    Sascha Hauer
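
Why an unbalanced disable hurts other devices can be seen with a simple enable-count model. This is a hedged sketch: `struct clk_model` and its helpers are hypothetical and do not reproduce the clk core, and the one-enable, two-disable sequence is an assumption based only on the description in the message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a shared clock with an enable count. */
struct clk_model {
	int enable_count;
};

static void clk_model_enable(struct clk_model *c)
{
	c->enable_count++;
}

static void clk_model_disable(struct clk_model *c)
{
	if (c->enable_count > 0)
		c->enable_count--;
}

static bool clk_model_running(const struct clk_model *c)
{
	return c->enable_count > 0;
}

/* Assumed buggy sequence: one enable, but two disables. */
static void buggy_freq_increase(struct clk_model *intermediate)
{
	clk_model_enable(intermediate);
	/* ... run from the intermediate clock while the PLL is reprogrammed ... */
	clk_model_disable(intermediate);
	clk_model_disable(intermediate);	/* drops another user's reference */
}

/* Fixed sequence in this model: the driver leaves the count alone. */
static void fixed_freq_increase(struct clk_model *intermediate)
{
	(void)intermediate;	/* enabling is left to the reparent operation */
}
```

If another device already holds one reference, `buggy_freq_increase()` leaves the count at zero and the clock stops for that device too, while `fixed_freq_increase()` leaves its reference untouched.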
     

24 Aug, 2013

1 commit

  • The "cpu" and "pclk_p_cclk" names were virtual clock names used in
    the legacy Tegra clock framework. They are no longer valid after the
    conversion to CCF. Replace them with the clock names actually in use.

    Tested-by: Stephen Warren
    Signed-off-by: Joseph Lo
    Signed-off-by: Viresh Kumar

    Joseph Lo
     

23 Aug, 2013

1 commit

  • Pull DT/core/cpufreq cpu_ofnode updates for v3.12 from Sudeep KarkadaNagesha.

    * 'cpu_of_node' of git://linux-arm.org/linux-skn:
    cpufreq: pmac32-cpufreq: remove device tree parsing for cpu nodes
    cpufreq: pmac64-cpufreq: remove device tree parsing for cpu nodes
    cpufreq: maple-cpufreq: remove device tree parsing for cpu nodes
    cpufreq: arm_big_little: remove device tree parsing for cpu nodes
    cpufreq: kirkwood-cpufreq: remove device tree parsing for cpu nodes
    cpufreq: spear-cpufreq: remove device tree parsing for cpu nodes
    cpufreq: highbank-cpufreq: remove device tree parsing for cpu nodes
    cpufreq: cpufreq-cpu0: remove device tree parsing for cpu nodes
    cpufreq: imx6q-cpufreq: remove device tree parsing for cpu nodes
    drivers/bus: arm-cci: avoid parsing DT for cpu device nodes
    ARM: mvebu: remove device tree parsing for cpu nodes
    ARM: topology: remove hwid/MPIDR dependency from cpu_capacity
    of/device: add helper to get cpu device node from logical cpu index
    driver/core: cpu: initialize of_node in cpu's device struture
    ARM: DT/kernel: define ARM specific arch_match_cpu_phys_id
    of: move of_get_cpu_node implementation to DT core library
    powerpc: refactor of_get_cpu_node to support other architectures
    openrisc: remove undefined of_get_cpu_node declaration
    microblaze: remove undefined of_get_cpu_node declaration

    Rafael J. Wysocki
     

21 Aug, 2013

10 commits

  • Now that the cpu device registration initialises the of_node(if available)
    appropriately for all the cpus, parsing here is redundant.

    This patch removes DT parsing and uses cpu->of_node instead.

    Cc: Benjamin Herrenschmidt
    Acked-by: Viresh Kumar
    Signed-off-by: Sudeep KarkadaNagesha

    Sudeep KarkadaNagesha
     
  • Now that the cpu device registration initialises the of_node(if available)
    appropriately for all the cpus, parsing here is redundant.

    This patch removes all DT parsing and uses cpu->of_node instead.

    Cc: Benjamin Herrenschmidt
    Acked-by: Viresh Kumar
    Signed-off-by: Sudeep KarkadaNagesha

    Sudeep KarkadaNagesha
     
  • Now that the cpu device registration initialises the of_node(if available)
    appropriately for all the cpus, parsing here is redundant.

    This patch removes all DT parsing and uses cpu->of_node instead.

    Cc: Dmitry Eremin-Solenikov
    Acked-by: Viresh Kumar
    Signed-off-by: Sudeep KarkadaNagesha

    Sudeep KarkadaNagesha
     
  • Now that the cpu device registration initialises the of_node(if available)
    appropriately for all the cpus, parsing here is redundant.

    This patch removes all DT parsing and uses cpu->of_node instead.

    Acked-by: Viresh Kumar
    Signed-off-by: Sudeep KarkadaNagesha

    Sudeep KarkadaNagesha
     
  • Now that the cpu device registration initialises the of_node(if available)
    appropriately for all the cpus, parsing here is redundant.

    This patch removes all DT parsing and uses cpu->of_node instead.

    Cc: Jason Cooper
    Acked-by: Andrew Lunn
    Acked-by: Viresh Kumar
    Signed-off-by: Sudeep KarkadaNagesha

    Sudeep KarkadaNagesha
     
  • Now that the cpu device registration initialises the of_node(if available)
    appropriately for all the cpus, parsing here is redundant.

    This patch removes all DT parsing and uses cpu->of_node instead.

    Cc: Deepak Sikri
    Acked-by: Viresh Kumar
    Signed-off-by: Sudeep KarkadaNagesha

    Sudeep KarkadaNagesha
     
  • Now that the cpu device registration initialises the of_node(if available)
    appropriately for all the cpus, parsing here is redundant.

    This patch removes all DT parsing and uses cpu->of_node instead.

    Cc: Mark Langsdorf
    Acked-by: Rob Herring
    Acked-by: Viresh Kumar
    Signed-off-by: Sudeep KarkadaNagesha

    Sudeep KarkadaNagesha
     
  • Now that the cpu device registration initialises the of_node(if available)
    appropriately for all the cpus, parsing here is redundant.

    This patch removes all DT parsing and uses cpu->of_node instead.

    Acked-by: Shawn Guo
    Acked-by: Rob Herring
    Acked-by: Viresh Kumar
    Signed-off-by: Sudeep KarkadaNagesha

    Sudeep KarkadaNagesha
     
  • Now that the cpu device registration initialises the of_node(if available)
    appropriately for all the cpus, parsing here is redundant.

    This patch removes all DT parsing and uses cpu->of_node instead.

    Acked-by: Shawn Guo
    Acked-by: Viresh Kumar
    Signed-off-by: Sudeep KarkadaNagesha

    Sudeep KarkadaNagesha
     
  • This patch tries to fix the lockdep complaint shown below.

    It seems that we should always acquire cpufreq_rwsem for reading,
    whether CONFIG_SMP is enabled or not. And CONFIG_HOTPLUG_CPU
    depends on CONFIG_SMP, so it seems we don't need CONFIG_SMP for the
    code enabled by CONFIG_HOTPLUG_CPU.

    [ 0.504191] =====================================
    [ 0.504627] [ BUG: bad unlock balance detected! ]
    [ 0.504627] 3.11.0-rc6-next-20130819 #1 Not tainted
    [ 0.504627] -------------------------------------
    [ 0.504627] swapper/1 is trying to release lock (cpufreq_rwsem) at:
    [ 0.504627] [] cpufreq_add_dev+0x13a/0x3e0
    [ 0.504627] but there are no more locks to release!
    [ 0.504627]
    [ 0.504627] other info that might help us debug this:
    [ 0.504627] 1 lock held by swapper/1:
    [ 0.504627] #0: (subsys mutex#4){+.+.+.}, at: [] subsys_interface_register+0x4f/0xe0
    [ 0.504627]
    [ 0.504627] stack backtrace:
    [ 0.504627] CPU: 0 PID: 1 Comm: swapper Not tainted 3.11.0-rc6-next-20130819 #1
    [ 0.504627] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
    [ 0.504627] ffffffff813d927a ffff88007f847c98 ffffffff814c062b ffff88007f847cc8
    [ 0.504627] ffffffff81098bce ffff88007f847cf8 ffffffff81aadc30 ffffffff813d927a
    [ 0.504627] 00000000ffffffff ffff88007f847d68 ffffffff8109d0be 0000000000000006
    [ 0.504627] Call Trace:
    [ 0.504627] [] ? cpufreq_add_dev+0x13a/0x3e0
    [ 0.504627] [] dump_stack+0x19/0x1b
    [ 0.504627] [] print_unlock_imbalance_bug+0xfe/0x110
    [ 0.504627] [] ? cpufreq_add_dev+0x13a/0x3e0
    [ 0.504627] [] lock_release_non_nested+0x1ee/0x310
    [ 0.504627] [] ? mark_held_locks+0xae/0x120
    [ 0.504627] [] ? kfree+0xcb/0x1d0
    [ 0.504627] [] ? cpufreq_policy_free+0x4a/0x60
    [ 0.504627] [] ? cpufreq_add_dev+0x13a/0x3e0
    [ 0.504627] [] lock_release+0xc4/0x250
    [ 0.504627] [] up_read+0x23/0x40
    [ 0.504627] [] cpufreq_add_dev+0x13a/0x3e0
    [ 0.504627] [] subsys_interface_register+0x99/0xe0
    [ 0.504627] [] ? cpufreq_gov_dbs_init+0x12/0x12
    [ 0.504627] [] cpufreq_register_driver+0x9d/0x1d0
    [ 0.504627] [] ? cpufreq_gov_dbs_init+0x12/0x12
    [ 0.504627] [] acpi_cpufreq_init+0xfe/0x1f8
    [ 0.504627] [] do_one_initcall+0xda/0x180
    [ 0.504627] [] kernel_init_freeable+0x12c/0x1bb
    [ 0.504627] [] ? do_early_param+0x8c/0x8c
    [ 0.504627] [] ? rest_init+0x140/0x140
    [ 0.504627] [] kernel_init+0xe/0xf0
    [ 0.504627] [] ret_from_fork+0x7a/0xb0
    [ 0.504627] [] ? rest_init+0x140/0x140

    Signed-off-by: Li Zhong
    Acked-and-tested-by: Viresh Kumar
    Signed-off-by: Rafael J. Wysocki

    Li Zhong
     

20 Aug, 2013

4 commits

  • To iterate over all policies we currently iterate over all online
    CPUs and then get the policy for each of them which is suboptimal.
    Use the newly created cpufreq_policy_list for this purpose instead.

    Signed-off-by: Viresh Kumar
    Signed-off-by: Rafael J. Wysocki

    Viresh Kumar
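
The difference between the two iteration strategies can be sketched with a small model in which several CPUs share a policy. All names here (`struct policy_node`, `cpu_data`, `policy_list`) are hypothetical stand-ins, and a singly linked list replaces the kernel's list type.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 4

/* Hypothetical policy object, chained into a list of active policies. */
struct policy_node {
	unsigned int cpu;		/* CPU managing this policy */
	struct policy_node *next;
};

static struct policy_node *cpu_data[NR_CPUS];	/* per-CPU policy pointer */
static struct policy_node *policy_list;		/* head of the policy list */

/* Old approach: one policy lookup per CPU, so shared policies repeat. */
static unsigned int visits_via_cpus(void)
{
	unsigned int cpu, visits = 0;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		if (cpu_data[cpu])
			visits++;

	return visits;
}

/* New approach: walk the policy list, one visit per policy. */
static unsigned int visits_via_list(void)
{
	const struct policy_node *p;
	unsigned int visits = 0;

	for (p = policy_list; p; p = p->next)
		visits++;

	return visits;
}
```

With four CPUs split across two policies, the per-CPU walk performs four lookups while the list walk visits each policy exactly once.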
     
  • cpufreq_policy_cpu per-cpu variables are used for storing the ID of
    the CPU that manages the given CPU's policy. However, we also store
    a policy pointer for each cpu in cpufreq_cpu_data, so the
    cpufreq_policy_cpu information is simply redundant.

    It is better to use cpufreq_cpu_data to retrieve a policy and get
    policy->cpu from there, so make that happen everywhere and drop the
    cpufreq_policy_cpu per-cpu variables which aren't necessary any more.

    [rjw: Changelog]
    Signed-off-by: Viresh Kumar
    Signed-off-by: Rafael J. Wysocki

    Viresh Kumar
     
  • We don't need to check if event is CPUFREQ_GOV_POLICY_INIT and put
    governor module as we are sure event can only be START/STOP here.

    Remove the useless check.

    Signed-off-by: Viresh Kumar
    Signed-off-by: Rafael J. Wysocki

    Viresh Kumar
     
  • cpufreq_policy_list is a list of active policies. We do remove
    policies from this list when all CPUs belonging to that policy are
    removed. But during system suspend we don't really free a policy
    struct as it will be used again during resume, so we didn't remove
    it from cpufreq_policy_list as well.

    However, this is incorrect. We are saying this policy isn't valid
    anymore and must not be referenced (though we haven't freed it), but
    it can still be used by code that iterates over cpufreq_policy_list.

    Remove policy from this list during system suspend as well.
    Of course, we must add it back whenever the first CPU belonging to
    that policy shows up.

    [rjw: Changelog]
    Signed-off-by: Viresh Kumar
    Signed-off-by: Rafael J. Wysocki

    Viresh Kumar