12 Jan, 2012

1 commit

  • * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq: (23 commits)
    [CPUFREQ] EXYNOS: Removed useless headers and codes
    [CPUFREQ] EXYNOS: Make EXYNOS common cpufreq driver
    [CPUFREQ] powernow-k8: Update copyright, maintainer and documentation information
    [CPUFREQ] powernow-k8: Fix indexing issue
    [CPUFREQ] powernow-k8: Avoid Pstate MSR accesses on systems supporting CPB
    [CPUFREQ] update lpj only if frequency has changed
    [CPUFREQ] cpufreq:userspace: fix cpu_cur_freq updation
    [CPUFREQ] Remove wall variable from cpufreq_gov_dbs_init()
    [CPUFREQ] EXYNOS4210: cpufreq code is changed for stable working
    [CPUFREQ] EXYNOS4210: Update frequency table for cpu divider
    [CPUFREQ] EXYNOS4210: Remove code about bus on cpufreq
    [CPUFREQ] s3c64xx: Use pr_fmt() for consistent log messages
    cpufreq: OMAP: fixup for omap_device changes, include
    cpufreq: OMAP: fix freq_table leak
    cpufreq: OMAP: put clk if cpu_init failed
    cpufreq: OMAP: only supports OPP library
    cpufreq: OMAP: dont support !freq_table
    cpufreq: OMAP: deny initialization if no mpudev
    cpufreq: OMAP: move clk name decision to init
    cpufreq: OMAP: notify even with bad boot frequency
    ...

    Linus Torvalds
     

06 Jan, 2012

1 commit

  • During scaling up of cpu frequency, loops_per_jiffy
    is updated upon invoking PRECHANGE notifier.
    If setting to new frequency fails in cpufreq driver,
    lpj is left at incorrect value.

    Hence update lpj only if cpu frequency is changed,
    i.e. upon invoking POSTCHANGE notifier.

    Penalty would be that during time period between
    changing cpu frequency & invocation of POSTCHANGE
    notifier, udelay(x) may not guarantee minimal delay
    of 'x' us for frequency scaling up operation.

    Perhaps a better solution would be to define
    CPUFREQ_ABORTCHANGE & handle accordingly, but then
    it would be more intrusive (using ABORTCHANGE may
    help drivers also; if any has registered notifier
    and expect POST for a PRECHANGE, their needs can
    be taken care using ABORT)

    Signed-off-by: Afzal Mohammed
    Signed-off-by: Dave Jones

    Afzal Mohammed
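
The ordering described in this commit can be modeled in a few lines. This is a userspace Python sketch, not kernel code: `Cpu`, `scaled_lpj`, `change_freq` and the `set_hw_freq` callback are hypothetical names, and the linear scaling of loops_per_jiffy with frequency is an assumption made for illustration.

```python
class Cpu:
    """Toy model of one CPU; all names here are illustrative, not kernel APIs."""

    def __init__(self, freq_khz, lpj):
        self.freq_khz = freq_khz
        self.lpj = lpj                  # loops_per_jiffy calibrated at freq_khz
        self.ref_freq_khz = freq_khz
        self.ref_lpj = lpj


def scaled_lpj(cpu, new_khz):
    # Assumption: lpj scales linearly with the CPU frequency.
    return cpu.ref_lpj * new_khz // cpu.ref_freq_khz


def change_freq(cpu, new_khz, set_hw_freq):
    """Switch frequency; update lpj only after the driver reports success."""
    # PRECHANGE: notifiers would run here, but lpj is deliberately left alone.
    if not set_hw_freq(new_khz):
        return False                    # driver failed: lpj still matches old freq
    cpu.freq_khz = new_khz
    # POSTCHANGE: the frequency really changed, so lpj can be adjusted now.
    cpu.lpj = scaled_lpj(cpu, new_khz)
    return True
```

With a driver callback that fails, lpj stays matched to the old frequency. The cost the commit mentions is the window between the hardware switch and the POSTCHANGE update when scaling up.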
     

22 Dec, 2011

1 commit

  • This moves the 'cpu sysdev_class' over to a regular 'cpu' subsystem
    and converts the devices to regular devices. The sysdev drivers are
    implemented as subsystem interfaces now.

    After all sysdev classes are ported to regular driver core entities, the
    sysdev implementation will be entirely removed from the kernel.

    Userspace relies on events and generic sysfs subsystem infrastructure
    from sysdev devices, which are made available with this conversion.

    Cc: Haavard Skinnemoen
    Cc: Hans-Christian Egtvedt
    Cc: Tony Luck
    Cc: Fenghua Yu
    Cc: Arnd Bergmann
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: Martin Schwidefsky
    Cc: Heiko Carstens
    Cc: Paul Mundt
    Cc: "David S. Miller"
    Cc: Chris Metcalf
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: "H. Peter Anvin"
    Cc: Borislav Petkov
    Cc: Tigran Aivazian
    Cc: Len Brown
    Cc: Zhang Rui
    Cc: Dave Jones
    Cc: Peter Zijlstra
    Cc: Russell King
    Cc: Andrew Morton
    Cc: Arjan van de Ven
    Cc: "Rafael J. Wysocki"
    Cc: "Srivatsa S. Bhat"
    Signed-off-by: Kay Sievers
    Signed-off-by: Greg Kroah-Hartman

    Kay Sievers
     

29 Jun, 2011

1 commit


04 May, 2011

3 commits

  • Since format string handling is part of request_module, there is no
    need to construct the module name. As such, drop the redundant sprintf
    and heap usage.

    Signed-off-by: Kees Cook
    Signed-off-by: Dave Jones

    Kees Cook
     
  • With dynamic debug having gained the capability to report debug messages
    also during the boot process, it offers a far superior interface for
    debug messages than the custom cpufreq infrastructure. As a first step,
    remove the old cpufreq_debug_printk() function and replace it with a call
    to the generic pr_debug() function.

    How can dynamic debug be used on cpufreq? You need a kernel which has
    CONFIG_DYNAMIC_DEBUG enabled.

    To enable debugging during runtime, mount debugfs and

    $ echo -n 'module cpufreq +p' > /sys/kernel/debug/dynamic_debug/control

    for debugging the complete "cpufreq" module. To achieve the same goal during
    boot, append

    ddebug_query="module cpufreq +p"

    as a boot parameter to the kernel of your choice.

    For more detailed instructions, please see
    Documentation/dynamic-debug-howto.txt

    Signed-off-by: Dominik Brodowski
    Signed-off-by: Dave Jones

    Dominik Brodowski
     
  • When we discover CPUs that are affected by each other's
    frequency/voltage transitions, the first CPU gets a sysfs directory
    created, and rest of the siblings get symlinks. Currently, when we
    hotplug off only the first CPU, all of the symlinks and the sysfs
    directory gets removed. Even though rest of the siblings are still
    online and functional, they are orphaned, and no longer governed by
    cpufreq.

    This patch, given the above scenario, creates a sysfs directory for
    the first sibling and symlinks for the rest of the siblings.

    Please note the recursive call, it was rather too ugly to roll it
    out. And the removal of redundant NULL setting (it is already taken
    care of near the top of the function).

    Signed-off-by: Jacob Shin
    Acked-by: Mark Langsdorf
    Reviewed-by: Thomas Renninger
    Signed-off-by: Dave Jones
    Cc: stable@kernel.org

    Jacob Shin
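
The hand-over this commit describes can be sketched as a small userspace model. This is Python for illustration only, not the kernel implementation: `remove_cpu` and its arguments are hypothetical, and the choice of the first remaining sibling as the new directory owner follows the commit text.

```python
def remove_cpu(policy_cpus, owner, cpu):
    """Return (remaining_cpus, new_owner, symlinks) after taking `cpu` offline.

    policy_cpus: CPUs whose frequency/voltage transitions affect each other.
    owner: the CPU that currently holds the real sysfs directory.
    """
    remaining = [c for c in policy_cpus if c != cpu]
    if not remaining:
        return [], None, []             # last CPU gone: nothing left to govern
    if cpu == owner:
        owner = remaining[0]            # promote the first remaining sibling
    symlinks = [c for c in remaining if c != owner]
    return remaining, owner, symlinks
```

Removing the owner of a four-CPU group leaves the next sibling with the directory and the others with symlinks, so none of them is orphaned.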
     

31 Mar, 2011

1 commit


24 Mar, 2011

1 commit

  • The cpufreq subsystem uses sysdev suspend and resume for
    executing cpufreq_suspend() and cpufreq_resume(), respectively,
    during system suspend, after interrupts have been switched off on the
    boot CPU, and during system resume, while interrupts are still off on
    the boot CPU. In both cases the other CPUs are off-line at the
    relevant point (either they have been switched off via CPU hotplug
    during suspend, or they haven't been switched on yet during resume).
    For this reason, although it may seem that cpufreq_suspend() and
    cpufreq_resume() are executed for all CPUs in the system, they are
    only called for the boot CPU in fact, which is quite confusing.

    To remove the confusion and to prepare for eliminating sysdev
    suspend and resume operations from the kernel entirely, convert
    cpufreq to using a struct syscore_ops object for the boot CPU
    suspend and resume and rename the callbacks so that their names
    reflect their purpose. In addition, put some explanatory remarks
    into their kerneldoc comments.

    Signed-off-by: Rafael J. Wysocki

    Rafael J. Wysocki
     

17 Mar, 2011

1 commit


02 Mar, 2011

1 commit

  • cpufreq_register_driver sets cpufreq_driver to a structure owned (and
    placed) in the caller's memory. If cpufreq policy fails in its ->init
    function, sysdev_driver_register returns nonzero in
    cpufreq_register_driver. Now, cpufreq_register_driver returns an error
    without setting cpufreq_driver back to NULL.

    Usually cpufreq policy modules are unloaded because they propagate the
    error to the module init function and return that.

    So a later access to any member of cpufreq_driver causes bugs like:
    BUG: unable to handle kernel paging request at ffffffffa00270a0
    IP: [] cpufreq_cpu_get+0x53/0xe0
    PGD 1805067 PUD 1809063 PMD 1c3f90067 PTE 0
    Oops: 0000 [#1] SMP
    last sysfs file: /sys/devices/virtual/net/tun0/statistics/collisions
    CPU 0
    Modules linked in: ...
    Pid: 5677, comm: thunderbird-bin Tainted: G W 2.6.38-rc4-mm1_64+ #1389 To be filled by O.E.M./To Be Filled By O.E.M.
    RIP: 0010:[] [] cpufreq_cpu_get+0x53/0xe0
    RSP: 0018:ffff8801aec37d98 EFLAGS: 00010086
    RAX: 0000000000000202 RBX: 0000000000000000 RCX: 0000000000000001
    RDX: ffffffffa00270a0 RSI: 0000000000001000 RDI: ffffffff8199ece8
    ...
    Call Trace:
    [] cpufreq_quick_get+0x10/0x30
    [] show_cpuinfo+0x2ab/0x300
    [] seq_read+0xf2/0x3f0
    [] ? __strncpy_from_user+0x33/0x60
    [] proc_reg_read+0x6d/0xa0
    [] vfs_read+0xc3/0x180
    [] sys_read+0x4c/0x90
    [] system_call_fastpath+0x16/0x1b
    ...

    It's all caused by weird fail path handling in cpufreq_register_driver.
    To fix that, shuffle the code to do proper handling with gotos.

    Signed-off-by: Jiri Slaby
    Signed-off-by: Dave Jones

    Jiri Slaby
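
The failure mode and its fix can be illustrated with a userspace model. This is a Python sketch under stated assumptions, not the kernel code: `register_driver`, `current_driver` and the `init_policy` callback are hypothetical stand-ins, and the module-level variable plays the role of the global cpufreq_driver pointer.

```python
import errno

_cpufreq_driver = None      # stands in for the kernel's global driver pointer


def current_driver():
    return _cpufreq_driver


def register_driver(driver, init_policy):
    """Install `driver`; undo the installation if policy init fails."""
    global _cpufreq_driver
    if _cpufreq_driver is not None:
        return -errno.EBUSY             # another driver is already registered
    _cpufreq_driver = driver
    ret = init_policy(driver)           # hypothetical stand-in for the ->init path
    if ret != 0:
        _cpufreq_driver = None          # the fix: never leave a stale pointer
    return ret
```

Without the reset on the error path, a later lookup would still see the driver object even though its module has already been unloaded.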
     

04 Jan, 2011

1 commit

  • Add these new power trace events:

    power:cpu_idle
    power:cpu_frequency
    power:machine_suspend

    The old C-state/idle accounting events:
    power:power_start
    power:power_end

    now have a replacement (but we are still keeping the old
    tracepoints for compatibility):

    power:cpu_idle

    and
    power:power_frequency

    is replaced with:
    power:cpu_frequency

    power:machine_suspend is newly introduced.

    Jean Pihet has a patch integrated into the generic layer
    (kernel/power/suspend.c) which will make use of it.

    The type= field was removed from both; it was never
    used, and the type is distinguished by the event type itself.

    perf timechart userspace tool gets adjusted in a separate patch.

    Signed-off-by: Thomas Renninger
    Signed-off-by: Ingo Molnar
    Acked-by: Arjan van de Ven
    Acked-by: Jean Pihet
    Cc: Arnaldo Carvalho de Melo
    Cc: Peter Zijlstra
    Cc: Linus Torvalds
    Cc: rjw@sisk.pl
    LKML-Reference:
    Signed-off-by: Ingo Molnar
    LKML-Reference:

    Thomas Renninger
     

22 Oct, 2010

1 commit

  • Indent the body of for_each_cpu.

    The semantic match that finds this problem is as follows:
    (http://coccinelle.lip6.fr/)

    //
    @r disable braces4@
    position p1,p2;
    statement S1,S2;
    @@

    (
    if (...) { ... }
    |
    if (...) S1@p1 S2@p2
    )

    @script:python@
    p1 << r.p1;
    p2 << r.p2;
    @@

    if (p1[0].column == p2[0].column):
      cocci.print_main("branch",p1)
      cocci.print_secs("after",p2)
    //

    Signed-off-by: Julia Lawall
    Signed-off-by: Dave Jones

    Julia Lawall
     

04 Aug, 2010

5 commits

  • This patch fixes up a brace warning found by the checkpatch.pl tool

    Signed-off-by: Neal Buckendahl
    Signed-off-by: Dave Jones

    Neal Buckendahl
     
  • and fix the broken case if a core's frequency depends on others.

    trace_power_frequency was only implemented in a rather ungeneric way
    in acpi-cpufreq driver's target() function only.
    -> Move the call to trace_power_frequency to
    cpufreq.c:cpufreq_notify_transition() where CPUFREQ_POSTCHANGE
    notifier is triggered.
    This will support power frequency tracing by all cpufreq drivers

    trace_power_frequency did not trace frequency changes correctly when
    the userspace governor was used or when CPU cores' frequency depend
    on each other.
    -> Moving this into the CPUFREQ_POSTCHANGE notifier and pass the cpu
    which gets switched automatically fixes this.

    Robert Schoene provided some important fixes on top of my initial
    quick shot version which are integrated in this patch:
    - Forgot some changes in power_end trace (TP_printk/variable names)
    - Variable dummy in power_end must now be cpu_id
    - Use static 64 bit variable instead of unsigned int for cpu_id

    Signed-off-by: Thomas Renninger
    CC: davej@redhat.com
    CC: arjan@infradead.org
    CC: linux-kernel@vger.kernel.org
    CC: robert.schoene@tu-dresden.de
    Tested-by: robert.schoene@tu-dresden.de
    Signed-off-by: Dave Jones

    Thomas Renninger
     
    lock_policy_rwsem_* and unlock_policy_rwsem_* functions were scheduled
    to be unexported in 2.6.33. Now that they have no callers outside
    cpufreq.c, unexport them and make them static.

    Signed-off-by: WANG Cong
    Cc: Venkatesh Pallipadi
    Signed-off-by: Dave Jones

    Amerigo Wang
     
  • We didn't free policy->related_cpus in error path err_unlock_policy.
    This was caught by the following kmemleak report:

    unreferenced object 0xffff88022a0b96d0 (size 512):
    comm "modprobe", pid 886, jiffies 4294689177 (age 780.694s)
    hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    backtrace:
    [] create_object+0x186/0x281
    [] kmemleak_alloc+0x60/0xa7
    [] kmem_cache_alloc_node_notrace+0x120/0x142
    [] alloc_cpumask_var_node+0x2c/0xd7
    [] alloc_cpumask_var+0x11/0x13
    [] zalloc_cpumask_var+0xf/0x11
    [] cpufreq_add_dev+0x11f/0x547
    [] sysdev_driver_register+0xc2/0x11d
    [] cpufreq_register_driver+0xcb/0x1b8
    [] 0xffffffffa032e040
    [] do_one_initcall+0x5e/0x15c
    [] sys_init_module+0xa6/0x1e6
    [] system_call_fastpath+0x16/0x1b
    [] 0xffffffffffffffff

    Signed-off-by: Xiaotian Feng
    Cc: Thomas Renninger
    Cc: Prarit Bhargava
    Signed-off-by: Dave Jones

    Xiaotian Feng
     
  • 395913d0b1db37092ea3d9d69b832183b1dd84c5 ("[CPUFREQ] remove rwsem lock
    from CPUFREQ_GOV_STOP call (second call site)") is not needed, because
    there is no rwsem lock in cpufreq_ondemand and cpufreq_conservative
    anymore. The lock should not be released until the work is done.

    Addresses https://bugzilla.kernel.org/show_bug.cgi?id=1594

    Signed-off-by: Andrej Gelenberg
    Cc: Mathieu Desnoyers
    Cc: Venkatesh Pallipadi
    Signed-off-by: Andrew Morton
    Acked-by: Mathieu Desnoyers
    Signed-off-by: Dave Jones

    Andrej Gelenberg
     

09 May, 2010

1 commit


10 Apr, 2010

1 commit

    Multiple modules used to define these with identical functionality,
    needlessly replicating them among the different cpufreq drivers.
    Push them into the header and remove the duplication.

    Signed-off-by: Borislav Petkov
    LKML-Reference:
    Reviewed-by: Thomas Renninger
    Signed-off-by: H. Peter Anvin

    Borislav Petkov
     

01 Apr, 2010

1 commit

  • There is no need to do sysfs_remove_link() or kobject_put() etc.
    when policy_rwsem_write is held, move them after releasing the lock.

    This fixes the lockdep warning:

    halt/4071 is trying to acquire lock:
    (s_active){++++.+}, at: [] .sysfs_addrm_finish+0x58/0xc0

    but task is already holding lock:
    (&per_cpu(cpu_policy_rwsem, cpu)){+.+.+.}, at: [] .lock_policy_rwsem_write+0x84/0xf4

    Reported-by: Benjamin Herrenschmidt
    Signed-off-by: WANG Cong
    Cc: Johannes Berg
    Cc: Venkatesh Pallipadi
    Signed-off-by: Dave Jones

    Amerigo Wang
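
The pattern behind this fix, deciding what to tear down under the lock and doing the teardown after releasing it, can be sketched in userspace. This is a Python illustration, not kernel code: `policy_lock`, `remove_dev` and the `teardown` callback are hypothetical stand-ins for the policy rwsem and the sysfs/kobject calls.

```python
import threading

policy_lock = threading.Lock()      # stands in for the per-CPU policy rwsem


def remove_dev(policy, teardown):
    """Detach state under the lock, then run teardown after releasing it."""
    with policy_lock:
        links = list(policy["links"])   # decide what to remove while locked
        policy["links"].clear()
    # Lock released: teardown may take other locks without nesting under ours.
    for link in links:
        teardown(link)
    return links
```

Because the teardown callback never runs with the policy lock held, the lock ordering that lockdep complained about cannot occur in this model.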
     

08 Mar, 2010

1 commit

  • Constify struct sysfs_ops.

    This is part of the ops structure constification
    effort started by Arjan van de Ven et al.

    Benefits of this constification:

    * prevents modification of data that is shared
    (referenced) by many other structure instances
    at runtime

    * detects/prevents accidental (but not intentional)
    modification attempts on archs that enforce
    read-only kernel data at runtime

    * potentially better optimized code as the compiler
    can assume that the const data cannot be changed

    * the compiler/linker move const data into .rodata
    and therefore exclude them from false sharing

    Signed-off-by: Emese Revfy
    Acked-by: David Teigland
    Acked-by: Matt Domsch
    Acked-by: Maciej Sosnowski
    Acked-by: Hans J. Koch
    Acked-by: Pekka Enberg
    Acked-by: Jens Axboe
    Acked-by: Stephen Hemminger
    Signed-off-by: Greg Kroah-Hartman

    Emese Revfy
     

15 Dec, 2009

1 commit

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (34 commits)
    m68k: rename global variable vmalloc_end to m68k_vmalloc_end
    percpu: add missing per_cpu_ptr_to_phys() definition for UP
    percpu: Fix kdump failure if booted with percpu_alloc=page
    percpu: make misc percpu symbols unique
    percpu: make percpu symbols in ia64 unique
    percpu: make percpu symbols in powerpc unique
    percpu: make percpu symbols in x86 unique
    percpu: make percpu symbols in xen unique
    percpu: make percpu symbols in cpufreq unique
    percpu: make percpu symbols in oprofile unique
    percpu: make percpu symbols in tracer unique
    percpu: make percpu symbols under kernel/ and mm/ unique
    percpu: remove some sparse warnings
    percpu: make alloc_percpu() handle array types
    vmalloc: fix use of non-existent percpu variable in put_cpu_var()
    this_cpu: Use this_cpu_xx in trace_functions_graph.c
    this_cpu: Use this_cpu_xx for ftrace
    this_cpu: Use this_cpu_xx in nmi handling
    this_cpu: Use this_cpu operations in RCU
    this_cpu: Use this_cpu ops for VM statistics
    ...

    Fix up trivial (famous last words) global per-cpu naming conflicts in
    arch/x86/kvm/svm.c
    mm/slab.c

    Linus Torvalds
     

25 Nov, 2009

2 commits

  • This interface is mainly intended (and implemented) for ACPI _PPC BIOS
    frequency limitations, but other cpufreq drivers can also use it for
    similar use-cases.

    Why is this needed:

    Currently it's not obvious why cpufreq got limited.
    People see cpufreq/scaling_max_freq reduced, but this could have
    happened by:
    - any userspace prog writing to scaling_max_freq
    - thermal limitations
    - hardware (_PPC in ACPI case) limitations

    Therefore export bios_limit (in kHz) to:
    - Point out to the user that it's the BIOS (broken or intended) which limits
    frequency
    - Export it as a sysfs interface for userspace progs.
    While this was a rarely used feature on laptops, there will appear
    more and more server implementations providing "Green IT" features like
    allowing the service processor to limit the frequency. People want
    to know about HW/BIOS frequency limitations.

    All ACPI P-state driven cpufreq drivers are covered with this patch:
    - powernow-k8
    - powernow-k7
    - acpi-cpufreq

    Tested with a patched DSDT which limits the first two cores (_PPC returns 1)
    via _PPC, exposed by bios_limit:
    # echo 2200000 >cpu2/cpufreq/scaling_max_freq
    # cat cpu*/cpufreq/scaling_max_freq
    2600000
    2600000
    2200000
    2200000
    # #scaling_max_freq shows general user/thermal/BIOS limitations

    # cat cpu*/cpufreq/bios_limit
    2600000
    2600000
    2800000
    2800000
    # #bios_limit only shows the HW/BIOS limitation

    CC: Pallipadi Venkatesh
    CC: Len Brown
    CC: davej@codemonkey.org.uk
    CC: linux@dominikbrodowski.net

    Signed-off-by: Thomas Renninger
    Signed-off-by: Dave Jones

    Thomas Renninger
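
A userspace tool could use the two files from the test output above to tell a BIOS limit from other limits. The following is a minimal Python sketch: `read_khz` and `explain_limit` are hypothetical helpers, and the two-way classification is a simplification of the cases listed in the commit message.

```python
def read_khz(path):
    """Read one integer kHz value from a sysfs file such as .../bios_limit."""
    with open(path) as f:
        return int(f.read().strip())


def explain_limit(scaling_max_khz, bios_limit_khz):
    """Classify why scaling_max_freq has its current value."""
    if scaling_max_khz < bios_limit_khz:
        # Something other than the BIOS pushed the maximum lower.
        return "user or thermal"
    return "bios"
```

With the values from the test output, cpu2 (2200000 against a bios_limit of 2800000) is classified as limited by the user or thermal, while cpu0 (2600000 for both) sits at the BIOS limit.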
     
  • No need to export these symbols; make them static.

    cpufreq_add_dev_policy
    cpufreq_add_dev_symlink
    cpufreq_add_dev_interface

    Signed-off-by: Alex Chiang
    Signed-off-by: Dave Jones

    Alex Chiang
     

18 Nov, 2009

2 commits

  • Dave,

    Attached is an update of my patch against the cpufreq fixes branch.

    Before applying the patch I compiled and booted the tree to see if the panic
    was still there -- to my surprise it was not. This is because of the conversion
    of cpufreq_cpu_governor to a char[].

    While the panic is kaput, the problem of stale data continues and my patch is
    still valid. It is possible to end up with the wrong governor after hotplug
    events because CPUFREQ_DEFAULT_GOVERNOR is statically linked to a default,
    while the cpu siblings may have had a different governor assigned by a user.

    i.e., the patch is still needed in order to keep the governors assigned
    properly when hotplugging devices

    Signed-off-by: Prarit Bhargava
    Signed-off-by: Dave Jones

    Prarit Bhargava
     
    Currently, on the governor backup/restore path we store the governor's pointer.
    This is wrong because one may unload the governor's module after a cpu goes
    offline. As a result, a use-after-free will take place on the restored cpu.
    It is not easy to exploit this bug, but we still have to close this
    issue ASAP. The issue was introduced by the following commit:
    084f34939424161669467c19280dbcf637730314

    ##TESTCASE##
    #!/bin/sh -x
    modprobe acpi_cpufreq
    # Any non-default governor, in my case it is "ondemand"
    modprobe cpufreq_ondemand
    echo ondemand > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
    rmmod acpi_cpufreq
    rmmod cpufreq_ondemand
    modprobe acpi_cpufreq # << use-after-free here.

    Signed-off-by: Dmitry Monakhov
    Signed-off-by: Dave Jones

    Dmitry Monakhov
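
The idea of saving a governor by name rather than by pointer can be modeled as follows. This is a userspace Python sketch, not the kernel code: `governors`, `saved_governor`, `cpu_offline`, `cpu_online` and the choice of "performance" as default are all assumptions made for illustration.

```python
governors = {}                      # name -> governor object (loaded modules)
saved_governor = {}                 # cpu -> governor *name*, never the object
DEFAULT_GOVERNOR = "performance"    # assumed default for this sketch


def cpu_offline(cpu, governor_name):
    """Remember which governor the CPU used; a name cannot dangle."""
    saved_governor[cpu] = governor_name


def cpu_online(cpu):
    """Return the governor name to use for a CPU that comes back online."""
    name = saved_governor.get(cpu, DEFAULT_GOVERNOR)
    if name not in governors:       # module was unloaded in the meantime
        name = DEFAULT_GOVERNOR
    return name
```

If the saved governor's module is gone when the CPU returns, the lookup by name simply fails and the default is used, instead of dereferencing freed memory.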
     

29 Oct, 2009

1 commit

  • This patch updates percpu related symbols in cpufreq such that percpu
    symbols are unique and don't clash with local symbols. This serves
    two purposes of decreasing the possibility of global percpu symbol
    collision and allowing dropping per_cpu__ prefix from percpu symbols.

    * drivers/cpufreq/cpufreq.c: s/policy_cpu/cpufreq_policy_cpu/
    * drivers/cpufreq/freq_table.c: s/show_table/cpufreq_show_table/
    * arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c: s/drv_data/acfreq_data/
    s/old_perf/acfreq_old_perf/

    Partly based on Rusty Russell's "alloc_percpu: rename percpu vars
    which cause name clashes" patch.

    Signed-off-by: Tejun Heo
    Cc: Rusty Russell

    Tejun Heo
     

02 Sep, 2009

9 commits

  • remove rwsem lock from CPUFREQ_GOV_STOP call (second call site)

    commit 42a06f2166f2f6f7bf04f32b4e823eacdceafdc9

    Missed a call site for CPUFREQ_GOV_STOP to remove the rwlock taken around the
    teardown. To make a long story short, the rwlock write-lock causes a circular
    dependency with cancel_delayed_work_sync(), because the timer handler takes the
    read lock.

    Note that all callers to __cpufreq_set_policy are taking the rwsem. All sysfs
    callers (writers) hold the write rwsem at the earliest sysfs calling stage.

    However, the rwlock write-lock is not needed upon governor stop.

    Signed-off-by: Mathieu Desnoyers
    Acked-by: Venkatesh Pallipadi
    CC: rjw@sisk.pl
    CC: mingo@elte.hu
    CC: Shaohua Li
    CC: Pekka Enberg
    CC: Dave Young
    CC: "Rafael J. Wysocki"
    CC: Rusty Russell
    CC: trenn@suse.de
    CC: sven.wegener@stealer.net
    CC: cpufreq@vger.kernel.org
    Signed-off-by: Dave Jones

    Mathieu Desnoyers
     
  • Currently everything in the cpufreq layer is per core based.
    This does not reflect reality, for example the ondemand and conservative
    governors have global sysfs variables.

    Introduce a global cpufreq directory and add the kobject to the governor
    struct, so that governors can easily access it.
    The directory is initialized in the cpufreq_core_init initcall and thus will
    always be created if cpufreq is compiled in, even if no cpufreq driver is
    active later.

    Signed-off-by: Thomas Renninger
    Signed-off-by: Dave Jones

    Thomas Renninger
     
  • Doing:
    echo 0 >cpu1/online
    echo 1 >cpu1/online

    on a managed CPU will result in:
    Jul 22 15:15:37 linux kernel: [ 80.013864] WARNING: at fs/sysfs/dir.c:487 sysfs_add_one+0xcf/0xe6()
    Jul 22 15:15:37 linux kernel: [ 80.013866] Hardware name: To Be Filled By O.E.M.
    Jul 22 15:15:37 linux kernel: [ 80.013868] sysfs: cannot create duplicate filename '/devices/system/cpu/cpu1/cpufreq'
    Jul 22 15:15:37 linux kernel: [ 80.013870] Modules linked in: powernow_k8
    Jul 22 15:15:37 linux kernel: [ 80.013874] Pid: 5750, comm: bash Not tainted 2.6.31-rc2 #40
    Jul 22 15:15:37 linux kernel: [ 80.013876] Call Trace:
    Jul 22 15:15:37 linux kernel: [ 80.013879] [] ? sysfs_add_one+0xcf/0xe6
    Jul 22 15:15:37 linux kernel: [ 80.013884] [] warn_slowpath_common+0x77/0xa4
    Jul 22 15:15:37 linux kernel: [ 80.013888] [] warn_slowpath_fmt+0x3c/0x3e
    Jul 22 15:15:37 linux kernel: [ 80.013891] [] sysfs_add_one+0xcf/0xe6
    Jul 22 15:15:37 linux kernel: [ 80.013894] [] create_dir+0x58/0x87
    Jul 22 15:15:37 linux kernel: [ 80.013898] [] sysfs_create_dir+0x38/0x4f
    Jul 22 15:15:37 linux kernel: [ 80.013902] [] kobject_add_internal+0x11f/0x1de
    Jul 22 15:15:37 linux kernel: [ 80.013905] [] kobject_add_varg+0x41/0x4e
    Jul 22 15:15:37 linux kernel: [ 80.013908] [] kobject_init_and_add+0x4c/0x57
    Jul 22 15:15:37 linux kernel: [ 80.013913] [] ? mark_lock+0x22/0x228
    Jul 22 15:15:37 linux kernel: [ 80.013918] [] cpufreq_add_dev_interface+0x40/0x1e4
    ...

    This bug slipped in by git commit:
    150b06f7f223cfd0f808737a5243cceca8ea47fa

    When cpufreq_add_dev was split up, the early exit no longer left the
    whole cpufreq_add_dev function, only cpufreq_add_dev_policy.
    This patch should restore the behavior to what it was before the
    split.

    CC: Venkatesh Pallipadi
    Signed-off-by: Thomas Renninger
    Signed-off-by: Dave Jones

    Thomas Renninger
     
  • Signed-off-by: Dave Jones

    Dave Jones
     
  • Signed-off-by: Dave Jones

    Dave Jones
     
  • Signed-off-by: Dave Jones

    Dave Jones
     
  • Signed-off-by: Dave Jones

    Dave Jones
     
  • Signed-off-by: Dave Jones

    Dave Jones
     
  • Commit 4bc5d3413503 is broken and causes regressions:

    (1) cpufreq_driver->resume() and ->suspend() were only called on
    __powerpc__, but you could set them on all architectures. In fact,
    ->resume() was defined and used before the PPC-related commit
    42d4dc3f4e1e complained about in 4bc5d3413503.

    (2) Therefore, the resume functions in acpi_cpufreq and speedstep-smi
    would never be called.

    (3) This means speedstep-smi would be unusable after suspend or resume.

    The _real_ problem was calling cpufreq_driver->get() with interrupts
    off, while it re-enables interrupts on some platforms. Why is ->get()
    necessary?

    Some systems like to change the CPU frequency behind our
    back, especially during BIOS-intensive operations like suspend or
    resume. If such systems also use a CPU frequency-dependent timing loop,
    delays might be off by large factors. Therefore, we need to ascertain
    as soon as possible that the CPU frequency is indeed at the speed we
    think it is. You can do this two ways: either setting it anew, or trying
    to get it. The latter is what was done, the former also has the same IRQ
    issue.

    So, let's try something different: defer the checking to after interrupts
    are re-enabled, by calling cpufreq_update_policy() (via schedule_work()).
    Timings may be off until this later stage, so let's watch out for
    resume regressions caused by the deferred handling of frequency changes
    behind the kernel's back.

    Signed-off-by: Dominik Brodowski
    Signed-off-by: Dave Jones

    Dominik Brodowski
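
The deferral described in this commit can be modeled with a simple work queue. This is a userspace Python sketch, not kernel code: `work_queue`, `resume`, `update_policy`, `run_deferred_work` and the `read_hw_khz` callback are hypothetical stand-ins for schedule_work(), the resume hook and cpufreq_update_policy().

```python
from collections import deque

work_queue = deque()                # stands in for the kernel workqueue


def update_policy(policy):
    # Safe to query the hardware here: interrupts are enabled again.
    policy["cur_khz"] = policy["read_hw_khz"]()


def resume(policy):
    """Runs with interrupts off: only queue the frequency check."""
    work_queue.append(lambda: update_policy(policy))


def run_deferred_work():
    """Called once interrupts are back on; drains the queued checks."""
    while work_queue:
        work_queue.popleft()()
```

Until the queue is drained, the recorded frequency may be stale, which matches the commit's warning that timings may be off until the deferred check has run.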
     

05 Aug, 2009

3 commits

  • The suspend code runs with interrupts disabled, and the powerpc workaround we
    do in the cpufreq suspend hook calls the driver's ->get method.

    powernow-k8's ->get does an smp_call_function_single
    which needs interrupts enabled

    cpufreq's suspend/resume code was added in 42d4dc3f4e1e to work around
    a hardware problem on ppc powerbooks. If we make all this code
    conditional on powerpc, we avoid the issue above.

    Signed-off-by: Dave Jones

    Dave Jones
     
  • The first offline/online cycle is successful, the second not.
    Doing:
    echo 0 >cpu1/online
    echo 1 >cpu1/online
    echo 0 >cpu1/online

    The last command will trigger:
    Jul 22 14:39:50 linux kernel: [ 593.210125] ------------[ cut here ]------------
    Jul 22 14:39:50 linux kernel: [ 593.210139] WARNING: at lib/kref.c:43 kref_get+0x23/0x2b()
    Jul 22 14:39:50 linux kernel: [ 593.210144] Hardware name: To Be Filled By O.E.M.
    Jul 22 14:39:50 linux kernel: [ 593.210148] Modules linked in: powernow_k8
    Jul 22 14:39:50 linux kernel: [ 593.210158] Pid: 378, comm: kondemand/2 Tainted: G W 2.6.31-rc2 #38
    Jul 22 14:39:50 linux kernel: [ 593.210163] Call Trace:
    Jul 22 14:39:50 linux kernel: [ 593.210171] [] ? kref_get+0x23/0x2b
    Jul 22 14:39:50 linux kernel: [ 593.210181] [] warn_slowpath_common+0x77/0xa4
    Jul 22 14:39:50 linux kernel: [ 593.210190] [] warn_slowpath_null+0xf/0x11
    Jul 22 14:39:50 linux kernel: [ 593.210198] [] kref_get+0x23/0x2b
    Jul 22 14:39:50 linux kernel: [ 593.210206] [] kobject_get+0x1a/0x22
    Jul 22 14:39:50 linux kernel: [ 593.210214] [] cpufreq_cpu_get+0x8a/0xcb
    Jul 22 14:39:50 linux kernel: [ 593.210222] [] __cpufreq_driver_getavg+0x1d/0x67
    Jul 22 14:39:50 linux kernel: [ 593.210231] [] do_dbs_timer+0x158/0x27f
    Jul 22 14:39:50 linux kernel: [ 593.210240] [] worker_thread+0x200/0x313
    ...

    The output continues on every do_dbs_timer ondemand freq checking poll.
    This regression was introduced by git commit:
    3f4a782b5ce2698b1870b5a7b573cd721d4fce33

    The policy is released when the cpufreq device is removed in:
    __cpufreq_remove_dev():
    /* if this isn't the CPU which is the parent of the kobj, we
    * only need to unlink, put and exit
    */

    Failing to create the symlink is not severe at all,
    as long as:
    sysfs_remove_link(&sys_dev->kobj, "cpufreq");
    gracefully handles the case where the symlink did not exist.
    Possibly no error should be returned at all, because ondemand
    governor would still provide the same functionality.
    Userspace might be confused in the userspace governor case if
    the link is missing.

    Resolves http://bugzilla.kernel.org/show_bug.cgi?id=13903

    CC: Mathieu Desnoyers
    CC: Venkatesh Pallipadi
    Signed-off-by: Thomas Renninger
    Signed-off-by: Dave Jones

    Thomas Renninger
     
  • Suspend/Resume fails on multi socket, multi core systems because the cpufreq
    code erroneously sets the per_cpu policy_cpu value when a logical cpu is
    offline.

    This most notably results in missing sysfs files that are used to set the
    cpu frequencies of the various cpus.

    Signed-off-by: Prarit Bhargava
    Signed-off-by: Dave Jones

    Prarit Bhargava