19 Sep, 2009

1 commit

  • * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq:
    [CPUFREQ] Fix NULL ptr regression in powernow-k8
    [CPUFREQ] Create a blacklist for processors that should not load the acpi-cpufreq module.
    [CPUFREQ] Powernow-k8: Enable more than 2 low P-states
    [CPUFREQ] remove rwsem lock from CPUFREQ_GOV_STOP call (second call site)
    [CPUFREQ] ondemand - Use global sysfs dir for tuning settings
    [CPUFREQ] Introduce global, not per core: /sys/devices/system/cpu/cpufreq
    [CPUFREQ] Bail out of cpufreq_add_dev if the link for a managed CPU got created
    [CPUFREQ] Factor out policy setting from cpufreq_add_dev
    [CPUFREQ] Factor out interface creation from cpufreq_add_dev
    [CPUFREQ] Factor out symlink creation from cpufreq_add_dev
    [CPUFREQ] cleanup up -ENOMEM handling in cpufreq_add_dev
    [CPUFREQ] Reduce scope of cpu_sys_dev in cpufreq_add_dev
    [CPUFREQ] update Doc for cpuinfo_cur_freq and scaling_cur_freq

    Linus Torvalds
     

16 Sep, 2009

1 commit

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (46 commits)
    powerpc64: convert to dynamic percpu allocator
    sparc64: use embedding percpu first chunk allocator
    percpu: kill lpage first chunk allocator
    x86,percpu: use embedding for 64bit NUMA and page for 32bit NUMA
    percpu: update embedding first chunk allocator to handle sparse units
    percpu: use group information to allocate vmap areas sparsely
    vmalloc: implement pcpu_get_vm_areas()
    vmalloc: separate out insert_vmalloc_vm()
    percpu: add chunk->base_addr
    percpu: add pcpu_unit_offsets[]
    percpu: introduce pcpu_alloc_info and pcpu_group_info
    percpu: move pcpu_lpage_build_unit_map() and pcpul_lpage_dump_cfg() upward
    percpu: add @align to pcpu_fc_alloc_fn_t
    percpu: make @dyn_size mandatory for pcpu_setup_first_chunk()
    percpu: drop @static_size from first chunk allocators
    percpu: generalize first chunk allocator selection
    percpu: build first chunk allocators selectively
    percpu: rename 4k first chunk allocator to page
    percpu: improve boot messages
    percpu: fix pcpu_reclaim() locking
    ...

    Fix trivial conflict in kernel/sched.c, as per Tejun Heo

    Linus Torvalds
     

02 Sep, 2009

10 commits

  • remove rwsem lock from CPUFREQ_GOV_STOP call (second call site)

    commit 42a06f2166f2f6f7bf04f32b4e823eacdceafdc9

    Missed a call site for CPUFREQ_GOV_STOP to remove the rwlock taken around the
    teardown. To make a long story short, the rwlock write-lock causes a circular
    dependency with cancel_delayed_work_sync(), because the timer handler takes the
    read lock.

    Note that all callers to __cpufreq_set_policy are taking the rwsem. All sysfs
    callers (writers) hold the write rwsem at the earliest sysfs calling stage.

    However, the rwlock write-lock is not needed upon governor stop.

    Signed-off-by: Mathieu Desnoyers
    Acked-by: Venkatesh Pallipadi
    CC: rjw@sisk.pl
    CC: mingo@elte.hu
    CC: Shaohua Li
    CC: Pekka Enberg
    CC: Dave Young
    CC: "Rafael J. Wysocki"
    CC: Rusty Russell
    CC: trenn@suse.de
    CC: sven.wegener@stealer.net
    CC: cpufreq@vger.kernel.org
    Signed-off-by: Dave Jones

    Mathieu Desnoyers
     
  • Ondemand has only global variables for userspace tunings via sysfs.
    But they were exposed per CPU which wrongly implies to the user that
    his settings are applied per cpu. Also locking sysfs against concurrent
    access won't be necessary anymore after deprecation time.

    This means the ondemand config dir is moved:
    /sys/devices/system/cpu/cpu*/cpufreq/ondemand ->
    /sys/devices/system/cpu/cpufreq/ondemand

    The old files will still exist, but reading or writing to them will
    result in one (printk_once) deprecation msg to syslog per file.
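For scripts that still write to the old per-CPU location, the move described above amounts to a path rewrite. The following is a minimal sketch, assuming only the two directory paths quoted in this message; the helper name `to_global_ondemand_path` is hypothetical and not part of any kernel or userspace tool.

```python
import re

# Both locations are taken from the commit message above.
_PER_CPU_DIR = re.compile(
    r"^/sys/devices/system/cpu/cpu\d+/cpufreq/ondemand(/.*)?$"
)
GLOBAL_DIR = "/sys/devices/system/cpu/cpufreq/ondemand"


def to_global_ondemand_path(path):
    """Map a deprecated per-CPU ondemand path to the global directory.

    Paths that do not point into a per-CPU ondemand directory are
    returned unchanged.
    """
    match = _PER_CPU_DIR.match(path)
    if match is None:
        return path
    # group(1) is the part below the ondemand directory, if any.
    return GLOBAL_DIR + (match.group(1) or "")
```

A caller would pass each tunable path through this helper before opening it, which avoids triggering the deprecation message on kernels that have the global directory.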

    Signed-off-by: Thomas Renninger
    Signed-off-by: Dave Jones

    Thomas Renninger
     
  • Currently everything in the cpufreq layer is per-core based.
    This does not reflect reality; for example, the ondemand and conservative
    governors have global sysfs variables.

    Introduce a global cpufreq directory and add the kobject to the governor
    struct, so that governors can easily access it.
    The directory is initialized in the cpufreq_core_init initcall and thus will
    always be created if cpufreq is compiled in, even if no cpufreq driver is
    active later.

    Signed-off-by: Thomas Renninger
    Signed-off-by: Dave Jones

    Thomas Renninger
     
  • Doing:
    echo 0 >cpu1/online
    echo 1 >cpu1/online

    on a managed CPU will result in:
    Jul 22 15:15:37 linux kernel: [ 80.013864] WARNING: at fs/sysfs/dir.c:487 sysfs_add_one+0xcf/0xe6()
    Jul 22 15:15:37 linux kernel: [ 80.013866] Hardware name: To Be Filled By O.E.M.
    Jul 22 15:15:37 linux kernel: [ 80.013868] sysfs: cannot create duplicate filename '/devices/system/cpu/cpu1/cpufreq'
    Jul 22 15:15:37 linux kernel: [ 80.013870] Modules linked in: powernow_k8
    Jul 22 15:15:37 linux kernel: [ 80.013874] Pid: 5750, comm: bash Not tainted 2.6.31-rc2 #40
    Jul 22 15:15:37 linux kernel: [ 80.013876] Call Trace:
    Jul 22 15:15:37 linux kernel: [ 80.013879] [] ? sysfs_add_one+0xcf/0xe6
    Jul 22 15:15:37 linux kernel: [ 80.013884] [] warn_slowpath_common+0x77/0xa4
    Jul 22 15:15:37 linux kernel: [ 80.013888] [] warn_slowpath_fmt+0x3c/0x3e
    Jul 22 15:15:37 linux kernel: [ 80.013891] [] sysfs_add_one+0xcf/0xe6
    Jul 22 15:15:37 linux kernel: [ 80.013894] [] create_dir+0x58/0x87
    Jul 22 15:15:37 linux kernel: [ 80.013898] [] sysfs_create_dir+0x38/0x4f
    Jul 22 15:15:37 linux kernel: [ 80.013902] [] kobject_add_internal+0x11f/0x1de
    Jul 22 15:15:37 linux kernel: [ 80.013905] [] kobject_add_varg+0x41/0x4e
    Jul 22 15:15:37 linux kernel: [ 80.013908] [] kobject_init_and_add+0x4c/0x57
    Jul 22 15:15:37 linux kernel: [ 80.013913] [] ? mark_lock+0x22/0x228
    Jul 22 15:15:37 linux kernel: [ 80.013918] [] cpufreq_add_dev_interface+0x40/0x1e4
    ...

    This bug was introduced by git commit:
    150b06f7f223cfd0f808737a5243cceca8ea47fa

    Since cpufreq_add_dev was split up, the early return no longer leaves
    cpufreq_add_dev as a whole, only cpufreq_add_dev_policy.
    This patch should restore the same behavior as before the split.

    CC: Venkatesh Pallipadi
    Signed-off-by: Thomas Renninger
    Signed-off-by: Dave Jones

    Thomas Renninger
     
  • Signed-off-by: Dave Jones

    Dave Jones
     
  • Signed-off-by: Dave Jones

    Dave Jones
     
  • Signed-off-by: Dave Jones

    Dave Jones
     
  • Signed-off-by: Dave Jones

    Dave Jones
     
  • Signed-off-by: Dave Jones

    Dave Jones
     
  • Commit 4bc5d3413503 is broken and causes regressions:

    (1) cpufreq_driver->resume() and ->suspend() were only called on
    __powerpc__, but you could set them on all architectures. In fact,
    ->resume() was defined and used before the PPC-related commit
    42d4dc3f4e1e complained about in 4bc5d3413503.

    (2) Therefore, the resume functions in acpi_cpufreq and speedstep-smi
    would never be called.

    (3) This means speedstep-smi would be unusable after suspend or resume.

    The _real_ problem was calling cpufreq_driver->get() with interrupts
    off, while it re-enables interrupts on some platforms. Why is ->get()
    necessary?

    Some systems like to change the CPU frequency behind our
    back, especially during BIOS-intensive operations like suspend or
    resume. If such systems also use a CPU frequency-dependent timing loop,
    delays might be off by large factors. Therefore, we need to ascertain
    as soon as possible that the CPU frequency is indeed at the speed we
    think it is. You can do this in two ways: either setting it anew, or trying
    to get it. The latter is what was done; the former has the same IRQ
    issue.

    So, let's try something different: defer the checking to after interrupts
    are re-enabled, by calling cpufreq_update_policy() (via schedule_work()).
    Timings may be off until this later stage, so let's watch out for
    resume regressions caused by the deferred handling of frequency changes
    behind the kernel's back.

    Signed-off-by: Dominik Brodowski
    Signed-off-by: Dave Jones

    Dominik Brodowski
     

14 Aug, 2009

1 commit

  • Conflicts:
    arch/sparc/kernel/smp_64.c
    arch/x86/kernel/cpu/perf_counter.c
    arch/x86/kernel/setup_percpu.c
    drivers/cpufreq/cpufreq_ondemand.c
    mm/percpu.c

    Conflicts in core and arch percpu codes are mostly from commit
    ed78e1e078dd44249f88b1dd8c76dafb39567161 which substituted many
    num_possible_cpus() with nr_cpu_ids. As for-next branch has moved all
    the first chunk allocators into mm/percpu.c, the changes are moved
    from arch code to mm/percpu.c.

    Signed-off-by: Tejun Heo

    Tejun Heo
     

05 Aug, 2009

4 commits

  • The suspend code runs with interrupts disabled, and the powerpc workaround we
    do in the cpufreq suspend hook calls the driver's ->get method.

    powernow-k8's ->get does an smp_call_function_single,
    which needs interrupts enabled.

    cpufreq's suspend/resume code was added in 42d4dc3f4e1e to work around
    a hardware problem on ppc powerbooks. If we make all this code
    conditional on powerpc, we avoid the issue above.

    Signed-off-by: Dave Jones

    Dave Jones
     
  • The first offline/online cycle is successful, the second not.
    Doing:
    echo 0 >cpu1/online
    echo 1 >cpu1/online
    echo 0 >cpu1/online

    The last command will trigger:
    Jul 22 14:39:50 linux kernel: [ 593.210125] ------------[ cut here ]------------
    Jul 22 14:39:50 linux kernel: [ 593.210139] WARNING: at lib/kref.c:43 kref_get+0x23/0x2b()
    Jul 22 14:39:50 linux kernel: [ 593.210144] Hardware name: To Be Filled By O.E.M.
    Jul 22 14:39:50 linux kernel: [ 593.210148] Modules linked in: powernow_k8
    Jul 22 14:39:50 linux kernel: [ 593.210158] Pid: 378, comm: kondemand/2 Tainted: G W 2.6.31-rc2 #38
    Jul 22 14:39:50 linux kernel: [ 593.210163] Call Trace:
    Jul 22 14:39:50 linux kernel: [ 593.210171] [] ? kref_get+0x23/0x2b
    Jul 22 14:39:50 linux kernel: [ 593.210181] [] warn_slowpath_common+0x77/0xa4
    Jul 22 14:39:50 linux kernel: [ 593.210190] [] warn_slowpath_null+0xf/0x11
    Jul 22 14:39:50 linux kernel: [ 593.210198] [] kref_get+0x23/0x2b
    Jul 22 14:39:50 linux kernel: [ 593.210206] [] kobject_get+0x1a/0x22
    Jul 22 14:39:50 linux kernel: [ 593.210214] [] cpufreq_cpu_get+0x8a/0xcb
    Jul 22 14:39:50 linux kernel: [ 593.210222] [] __cpufreq_driver_getavg+0x1d/0x67
    Jul 22 14:39:50 linux kernel: [ 593.210231] [] do_dbs_timer+0x158/0x27f
    Jul 22 14:39:50 linux kernel: [ 593.210240] [] worker_thread+0x200/0x313
    ...

    The output continues on every do_dbs_timer ondemand freq checking poll.
    This regression was introduced by git commit:
    3f4a782b5ce2698b1870b5a7b573cd721d4fce33

    The policy is released when the cpufreq device is removed in:
    __cpufreq_remove_dev():
    /* if this isn't the CPU which is the parent of the kobj, we
    * only need to unlink, put and exit
    */

    Failing to create the symlink is not severe at all, as long as:
    sysfs_remove_link(&sys_dev->kobj, "cpufreq");
    gracefully handles the case where the symlink does not exist.
    Possibly no error should be returned at all, because the ondemand
    governor would still provide the same functionality.
    With the userspace governor, userspace might be confused if the link
    is missing.

    Resolves http://bugzilla.kernel.org/show_bug.cgi?id=13903

    CC: Mathieu Desnoyers
    CC: Venkatesh Pallipadi
    Signed-off-by: Thomas Renninger
    Signed-off-by: Dave Jones

    Thomas Renninger
     
  • Suspend/Resume fails on multi socket, multi core systems because the cpufreq
    code erroneously sets the per_cpu policy_cpu value when a logical cpu is
    offline.

    This most notably results in missing sysfs files that are used to set the
    cpu frequencies of the various cpus.

    Signed-off-by: Prarit Bhargava
    Signed-off-by: Dave Jones

    Prarit Bhargava
     
  • Commit ee88415caf736b89500f16e0a545614541a45005
    introduced this regression when it removed the enable bit in cpu_dbs_info_s.
    That made it possible for dbs_cpufreq_notifier to be called for a
    CPU that is not yet managed by the conservative governor. This happens
    because the transition notifier is set as soon as one CPU switches to
    the conservative governor, so other CPUs can hit a NULL pointer dereference
    without the enable bit check. Add the enable bit back again.
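The guard described above can be modelled outside the kernel. This is a minimal sketch under stated assumptions: `CpuDbsInfo` is a hypothetical stand-in for `cpu_dbs_info_s`, the policy is a plain dict, and the clamping step is only an illustration of "some access to the policy", not the actual notifier logic.

```python
class CpuDbsInfo:
    """Per-CPU governor state (hypothetical stand-in for cpu_dbs_info_s)."""

    def __init__(self):
        self.enable = False     # set once the governor manages this CPU
        self.cur_policy = None  # stays None until the governor starts


def dbs_cpufreq_notifier(info, new_freq):
    """Model of the transition notifier with the enable check in place."""
    if not info.enable:
        # CPU is not managed yet: return before touching cur_policy,
        # which would still be None here.
        return None
    policy = info.cur_policy
    # Illustrative use of the policy: keep the frequency within its limits.
    return max(policy["min"], min(policy["max"], new_freq))
```

Without the early return, the call for an unmanaged CPU would index into `None`, which is the Python analogue of the NULL pointer dereference reported here.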

    Reported-by: Lermytte Christophe
    Signed-off-by: Venkatesh Pallipadi
    Signed-off-by: Dave Jones

    Pallipadi, Venkatesh
     

09 Jul, 2009

1 commit


07 Jul, 2009

4 commits

  • OK, I've tried to clean it up the best I could, but please test this with
    concurrent cpu hotplug and cpufreq add/remove in loops. I'm sure we will make
    other interesting findings.

    This is step one of fixing the overall locking dependency mess in cpufreq.

    Signed-off-by: Mathieu Desnoyers
    CC: Venkatesh Pallipadi
    CC: rjw@sisk.pl
    CC: mingo@elte.hu
    CC: Shaohua Li
    CC: Pekka Enberg
    CC: Dave Young
    CC: "Rafael J. Wysocki"
    CC: Rusty Russell
    CC: sven.wegener@stealer.net
    CC: cpufreq@vger.kernel.org
    CC: Thomas Renninger
    Signed-off-by: Dave Jones

    Mathieu Desnoyers
     
  • Redesign the locking inside conservative driver. Make dbs_mutex handle all the
    global state changes inside the driver and invent a new percpu mutex
    to serialize percpu timer and frequency limit change.

    Signed-off-by: Venkatesh Pallipadi
    Signed-off-by: Dave Jones

    venkatesh.pallipadi@intel.com
     
  • Redesign the locking inside ondemand driver. Make dbs_mutex handle all the
    global state changes inside the driver and invent a new percpu mutex
    to serialize percpu timer and frequency limit change.
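The split between one global mutex and one mutex per CPU can be sketched as follows. This is a userspace model only, assuming hypothetical names (`timer_mutex`, `governor_start`, `timer_tick`, `limits_changed`) and Python locks in place of kernel mutexes; it shows which lock covers which state, not the actual driver code.

```python
import threading

dbs_mutex = threading.Lock()  # covers global governor state only
dbs_enable = 0                # number of CPUs using the governor


class CpuDbsInfo:
    """Per-CPU state with its own lock (names are assumptions)."""

    def __init__(self):
        # Serializes the per-CPU timer against frequency limit changes.
        self.timer_mutex = threading.Lock()
        self.freq = 0
        self.max_freq = 0


def governor_start(info, max_freq):
    global dbs_enable
    with dbs_mutex:            # global state change
        dbs_enable += 1
    with info.timer_mutex:     # per-CPU state change
        info.max_freq = max_freq
        info.freq = max_freq


def timer_tick(info, wanted_freq):
    """Periodic sampling: never exceeds the current limit."""
    with info.timer_mutex:
        info.freq = min(wanted_freq, info.max_freq)


def limits_changed(info, new_max):
    """Limit change: cannot interleave with timer_tick on the same CPU."""
    with info.timer_mutex:
        info.max_freq = new_max
        info.freq = min(info.freq, new_max)
```

The point of the design is that the timer path never needs the global mutex, so a slow global operation on one CPU does not block sampling on another.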

    Signed-off-by: Venkatesh Pallipadi
    Signed-off-by: Dave Jones

    venkatesh.pallipadi@intel.com
     
  • Commit b14893a62c73af0eca414cfed505b8c09efc613c, although it was very
    much needed to properly clean up the ondemand timer, opened up a can of worms
    related to locking dependencies in cpufreq.

    This patch defines the need for dbs_mutex and cleans up its usage in the
    ondemand governor. It also resolves the lockdep warnings reported here:

    http://lkml.indiana.edu/hypermail/linux/kernel/0906.1/01925.html
    http://lkml.indiana.edu/hypermail/linux/kernel/0907.0/00820.html

    and a few others.

    Signed-off-by: Venkatesh Pallipadi
    Signed-off-by: Dave Jones

    venkatesh.pallipadi@intel.com
     

24 Jun, 2009

1 commit

  • Percpu variable definition is about to be updated such that all percpu
    symbols including the static ones must be unique. Update percpu
    variable definitions accordingly.

    * as,cfq: rename ioc_count uniquely

    * cpufreq: rename cpu_dbs_info uniquely

    * xen: move nesting_count out of xen_evtchn_do_upcall() and rename it

    * mm: move ratelimits out of balance_dirty_pages_ratelimited_nr() and
    rename it

    * ipv4,6: rename cookie_scratch uniquely

    * x86 perf_counter: rename prev_left to pmc_prev_left, irq_entry to
    pmc_irq_entry and nmi_entry to pmc_nmi_entry

    * perf_counter: rename disable_count to perf_disable_count

    * ftrace: rename test_event_disable to ftrace_test_event_disable

    * kmemleak: rename test_pointer to kmemleak_test_pointer

    * mce: rename next_interval to mce_next_interval

    [ Impact: percpu usage cleanups, no duplicate static percpu var names ]
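A check for the uniqueness requirement described above could look like the sketch below. It assumes the simple two-argument form `DEFINE_PER_CPU(type, name)` and takes file contents as an in-memory dict; the function name is hypothetical and the regular expression will miss more elaborate declarations.

```python
import re
from collections import defaultdict

# Matches the simple form DEFINE_PER_CPU(type, name); an assumption that
# ignores variants of the macro and types containing commas or parentheses.
_DEFINE = re.compile(r"DEFINE_PER_CPU\s*\(\s*[^,()]+,\s*(\w+)\s*\)")


def find_duplicate_percpu_names(sources):
    """Return {name: sorted list of files} for names defined in several files.

    `sources` maps a file name to the text of that file.
    """
    seen = defaultdict(set)
    for filename, text in sources.items():
        for name in _DEFINE.findall(text):
            seen[name].add(filename)
    return {
        name: sorted(files) for name, files in seen.items() if len(files) > 1
    }
```

Run over a source tree, any non-empty result lists the symbols that would need a rename of the kind performed in this commit.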

    Signed-off-by: Tejun Heo
    Reviewed-by: Christoph Lameter
    Cc: Ivan Kokshaysky
    Cc: Jens Axboe
    Cc: Dave Jones
    Cc: Jeremy Fitzhardinge
    Cc: linux-mm
    Cc: David S. Miller
    Cc: Peter Zijlstra
    Cc: Steven Rostedt
    Cc: Li Zefan
    Cc: Catalin Marinas
    Cc: Andi Kleen

    Tejun Heo
     

15 Jun, 2009

2 commits

  • Update the documentation accordingly.
    Cleanup and use printk_once.

    Signed-off-by: Thomas Renninger
    Signed-off-by: Dave Jones

    Thomas Renninger
     
  • With this patch you have the following minimal sampling rate restrictions:

    Kernel restrictions:
    If CONFIG_NO_HZ is set, the limit is 10ms fixed.
    If CONFIG_NO_HZ is not set or no_hz=off boot parameter is used, the
    limits depend on the CONFIG_HZ option:
    HZ=1000: min=20000us (20ms)
    HZ=250: min=80000us (80ms)
    HZ=100: min=200000us (200ms)

    HW restrictions:
    Do not sample/poll more often than HW latency * 100, as exported by the
    low-level cpufreq HW driver.

    The higher of the above restrictions is the minimal sampling rate
    that can be set (and can be seen via the ondemand/sampling_rate_min sysfs file).

    The default sampling rate still is HW latency * 1000, but this will now end
    up in lower values on the latest (Intel and AMD) hardware, as these can switch
    really fast and the sampling rate mostly was limited to 80ms or 200ms
    (depending on whether HZ=250 or HZ=100 is used).
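The limits listed above can be worked through numerically. This is a minimal sketch, not the kernel implementation: the constant of 20 ticks is inferred from the HZ table (20000us at HZ=1000), the hardware latency is assumed to be given in microseconds, and clamping the default to the minimum is an assumption as well.

```python
MIN_SAMPLING_TICKS = 20   # inferred: 20000us at HZ=1000 equals 20 ticks
NO_HZ_MIN_US = 10_000     # fixed 10ms limit when CONFIG_NO_HZ is active


def kernel_min_us(hz, no_hz):
    """Kernel-side lower bound in microseconds."""
    if no_hz:
        return NO_HZ_MIN_US
    return MIN_SAMPLING_TICKS * 1_000_000 // hz


def min_sampling_rate_us(hz, no_hz, hw_latency_us):
    """The higher of the kernel limit and HW latency * 100."""
    return max(kernel_min_us(hz, no_hz), hw_latency_us * 100)


def default_sampling_rate_us(hz, no_hz, hw_latency_us):
    """HW latency * 1000, never below the minimum (assumed clamping)."""
    return max(min_sampling_rate_us(hz, no_hz, hw_latency_us),
               hw_latency_us * 1000)
```

For example, with CONFIG_NO_HZ active and a hardware latency of 10us, the minimum is the 10ms kernel limit and the default is also 10ms, well below the 80ms floor that applies at HZ=250 without NO_HZ.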

    Signed-off-by: Thomas Renninger
    Cc: Pallipadi Venkatesh
    Signed-off-by: Dave Jones

    Thomas Renninger
     

09 Jun, 2009

1 commit


27 May, 2009

3 commits

  • * Rafael J. Wysocki (rjw@sisk.pl) wrote:
    > This message has been generated automatically as a part of a report
    > of regressions introduced between 2.6.28 and 2.6.29.
    >
    > The following bug entry is on the current list of known regressions
    > introduced between 2.6.28 and 2.6.29. Please verify if it still should
    > be listed and let me know (either way).
    >
    >
    > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=13186
    > Subject : cpufreq timer teardown problem
    > Submitter : Mathieu Desnoyers
    > Date : 2009-04-23 14:00 (24 days old)
    > References : http://marc.info/?l=linux-kernel&m=124049523515036&w=4
    > Handled-By : Mathieu Desnoyers
    > Patch : http://patchwork.kernel.org/patch/19754/
    > http://patchwork.kernel.org/patch/19753/
    >

    (updated changelog)

    cpufreq fix timer teardown in ondemand governor

    The problem is that dbs_timer_exit() uses cancel_delayed_work() when it should
    use cancel_delayed_work_sync(). cancel_delayed_work() does not wait for the
    workqueue handler to exit.

    The ondemand governor does not seem to be affected because the
    "if (!dbs_info->enable)" check at the beginning of the workqueue handler returns
    immediately without rescheduling the work. The conservative governor in
    2.6.30-rc has the same check as the ondemand governor, which makes things
    usually run smoothly. However, if the governor is quickly stopped and then
    started, this could lead to the following race :

    dbs_enable could be reenabled and multiple do_dbs_timer handlers would run.
    This is why a synchronized teardown is required.

    The following patch applies to, at least, 2.6.28.x, 2.6.29.1, 2.6.30-rc2.

    Depends on patch
    cpufreq: remove rwsem lock from CPUFREQ_GOV_STOP call

    Signed-off-by: Mathieu Desnoyers
    CC: Andrew Morton
    CC: gregkh@suse.de
    CC: stable@kernel.org
    CC: cpufreq@vger.kernel.org
    CC: Ingo Molnar
    CC: rjw@sisk.pl
    CC: Ben Slusky
    Signed-off-by: Dave Jones

    Mathieu Desnoyers
     
  • * Rafael J. Wysocki (rjw@sisk.pl) wrote:
    > This message has been generated automatically as a part of a report
    > of regressions introduced between 2.6.28 and 2.6.29.
    >
    > The following bug entry is on the current list of known regressions
    > introduced between 2.6.28 and 2.6.29. Please verify if it still should
    > be listed and let me know (either way).
    >
    >
    > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=13186
    > Subject : cpufreq timer teardown problem
    > Submitter : Mathieu Desnoyers
    > Date : 2009-04-23 14:00 (24 days old)
    > References : http://marc.info/?l=linux-kernel&m=124049523515036&w=4
    > Handled-By : Mathieu Desnoyers
    > Patch : http://patchwork.kernel.org/patch/19754/
    > http://patchwork.kernel.org/patch/19753/
    >

    (re-send with updated changelog)

    cpufreq fix timer teardown in conservative governor

    The problem is that dbs_timer_exit() uses cancel_delayed_work() when it should
    use cancel_delayed_work_sync(). cancel_delayed_work() does not wait for the
    workqueue handler to exit.

    The ondemand governor does not seem to be affected because the
    "if (!dbs_info->enable)" check at the beginning of the workqueue handler returns
    immediately without rescheduling the work. The conservative governor in
    2.6.30-rc has the same check as the ondemand governor, which makes things
    usually run smoothly. However, if the governor is quickly stopped and then
    started, this could lead to the following race :

    dbs_enable could be reenabled and multiple do_dbs_timer handlers would run.
    This is why a synchronized teardown is required.

    Depends on patch
    cpufreq: remove rwsem lock from CPUFREQ_GOV_STOP call

    The following patch applies to 2.6.30-rc2. Stable kernels have a similar
    issue which should also be fixed, but the code changed between 2.6.29
    and 2.6.30, so this patch only applies to 2.6.30-rc.

    Signed-off-by: Mathieu Desnoyers
    CC: Andrew Morton
    CC: gregkh@suse.de
    CC: stable@kernel.org
    CC: cpufreq@vger.kernel.org
    CC: Ingo Molnar
    CC: rjw@sisk.pl
    CC: Ben Slusky
    Signed-off-by: Dave Jones

    Mathieu Desnoyers
     
  • * Rafael J. Wysocki (rjw@sisk.pl) wrote:
    > This message has been generated automatically as a part of a report
    > of regressions introduced between 2.6.28 and 2.6.29.
    >
    > The following bug entry is on the current list of known regressions
    > introduced between 2.6.28 and 2.6.29. Please verify if it still should
    > be listed and let me know (either way).
    >
    >
    > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=13186
    > Subject : cpufreq timer teardown problem
    > Submitter : Mathieu Desnoyers
    > Date : 2009-04-23 14:00 (24 days old)
    > References : http://marc.info/?l=linux-kernel&m=124049523515036&w=4
    > Handled-By : Mathieu Desnoyers
    > Patch : http://patchwork.kernel.org/patch/19754/
    > http://patchwork.kernel.org/patch/19753/

    The patches linked above depend on the following patch to remove
    circular locking dependency :

    cpufreq: remove rwsem lock from CPUFREQ_GOV_STOP call

    (The following issue was encountered when using cancel_delayed_work_sync()
    in the timer teardown, which fixes a race.)

    * KOSAKI Motohiro (kosaki.motohiro@jp.fujitsu.com) wrote:
    > Hi
    >
    > my box output following warnings.
    > it seems regression by commit 7ccc7608b836e58fbacf65ee4f8eefa288e86fac.
    >
    > A: work -> do_dbs_timer() -> cpu_policy_rwsem
    > B: store() -> cpu_policy_rwsem -> cpufreq_governor_dbs() -> work
    >
    >

    Hrm, I think it must be due to my attempt to fix the timer teardown race
    in ondemand governor mixed with new locking behavior in 2.6.30-rc.

    The rwlock seems to be taken around the whole call to
    cpufreq_governor_dbs(), when it should be only taken around accesses to
    the locked data, and especially *not* around the call to
    dbs_timer_exit().

    Reverting my fix attempt would put the teardown race back in place
    (replacing the cancel_delayed_work_sync by cancel_delayed_work).
    Instead, a proper fix would imply modifying this critical section :

    cpufreq.c: __cpufreq_remove_dev()
    ...
    if (cpufreq_driver->target)
    __cpufreq_governor(data, CPUFREQ_GOV_STOP);

    unlock_policy_rwsem_write(cpu);

    To make sure the __cpufreq_governor() callback is not called with rwsem
    held. This would allow execution of cancel_delayed_work_sync() without
    being nested within the rwsem.

    Applies on top of the 2.6.30-rc5 tree.

    Required to remove the circular dependency in the teardown of both the
    conservative and ondemand governors so they can use
    cancel_delayed_work_sync(). CPUFREQ_GOV_STOP does not modify the policy,
    therefore this locking seemed unneeded.
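The circular dependency reported in chains A and B above can be checked mechanically. This is a minimal sketch, assuming each dependency is written as an edge "held, then waited for"; the graph encoding and the names `HANDLER_TAKES_RWSEM` and `STOP_UNDER_RWSEM` are illustrative and not taken from lockdep.

```python
def has_cycle(edges):
    """Depth-first search for a cycle in a 'held -> waited for' graph."""
    graph = {}
    for held, wanted in edges:
        graph.setdefault(held, set()).add(wanted)
        graph.setdefault(wanted, set())

    state = {}  # node -> "visiting" or "done"

    def visit(node):
        if state.get(node) == "visiting":
            return True   # reached a node that is still on the current path
        if state.get(node) == "done":
            return False
        state[node] = "visiting"
        if any(visit(nxt) for nxt in graph[node]):
            return True
        state[node] = "done"
        return False

    return any(visit(node) for node in graph)


# A: the work handler (do_dbs_timer) takes cpu_policy_rwsem.
HANDLER_TAKES_RWSEM = [("work", "cpu_policy_rwsem")]

# B: store() holds cpu_policy_rwsem while the governor stop path waits
# for the work to finish (cancel_delayed_work_sync).
STOP_UNDER_RWSEM = [("cpu_policy_rwsem", "work")]
```

With both edges present the graph contains a cycle; dropping the second edge, which corresponds to calling the governor stop without the rwsem held, leaves it acyclic.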

    Signed-off-by: Mathieu Desnoyers
    CC: KOSAKI Motohiro
    Cc: Greg KH
    CC: Ingo Molnar
    CC: "Rafael J. Wysocki"
    CC: Ben Slusky
    CC: Chris Wright
    CC: Andrew Morton
    Signed-off-by: Dave Jones

    Mathieu Desnoyers
     

27 Mar, 2009

1 commit

  • * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq: (35 commits)
    [CPUFREQ] Prevent p4-clockmod from auto-binding to the ondemand governor.
    [CPUFREQ] Make cpufreq-nforce2 less obnoxious
    [CPUFREQ] p4-clockmod reports wrong frequency.
    [CPUFREQ] powernow-k8: Use a common exit path.
    [CPUFREQ] Change link order of x86 cpufreq modules
    [CPUFREQ] conservative: remove 10x from def_sampling_rate
    [CPUFREQ] conservative: fixup governor to function more like ondemand logic
    [CPUFREQ] conservative: fix dbs_cpufreq_notifier so freq is not locked
    [CPUFREQ] conservative: amend author's email address
    [CPUFREQ] Use swap() in longhaul.c
    [CPUFREQ] checkpatch cleanups for acpi-cpufreq
    [CPUFREQ] powernow-k8: Only print error message once, not per core.
    [CPUFREQ] ondemand/conservative: sanitize sampling_rate restrictions
    [CPUFREQ] ondemand/conservative: deprecate sampling_rate{min,max}
    [CPUFREQ] powernow-k8: Always compile powernow-k8 driver with ACPI support
    [CPUFREQ] Introduce /sys/devices/system/cpu/cpu*/cpufreq/cpuinfo_transition_latency
    [CPUFREQ] checkpatch cleanups for powernow-k8
    [CPUFREQ] checkpatch cleanups for ondemand governor.
    [CPUFREQ] checkpatch cleanups for powernow-k7
    [CPUFREQ] checkpatch cleanups for speedstep related drivers.
    ...

    Linus Torvalds
     

10 Mar, 2009

1 commit

  • This reverts commit e088e4c9cdb618675874becb91b2fd581ee707e6.

    Removing the sysfs interface for p4-clockmod was flagged as a
    regression in bug 12826.

    Course of action:
    - Find out the remaining causes of overheating, and fix them
    if possible. ACPI should be doing the right thing automatically.
    If it isn't, we need to fix that.
    - mark p4-clockmod ui as deprecated
    - try again with the removal in six months.

    It's not really feasible to printk about the deprecation, because
    it needs to happen at all the sysfs entry points, which means adding
    a lot of strcmp("p4-clockmod".. calls to the core, which.. bleuch.

    Signed-off-by: Dave Jones

    Dave Jones
     

25 Feb, 2009

9 commits