12 Jan, 2012

1 commit

  • * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq: (23 commits)
    [CPUFREQ] EXYNOS: Removed useless headers and codes
    [CPUFREQ] EXYNOS: Make EXYNOS common cpufreq driver
    [CPUFREQ] powernow-k8: Update copyright, maintainer and documentation information
    [CPUFREQ] powernow-k8: Fix indexing issue
    [CPUFREQ] powernow-k8: Avoid Pstate MSR accesses on systems supporting CPB
    [CPUFREQ] update lpj only if frequency has changed
    [CPUFREQ] cpufreq:userspace: fix cpu_cur_freq updation
    [CPUFREQ] Remove wall variable from cpufreq_gov_dbs_init()
    [CPUFREQ] EXYNOS4210: cpufreq code is changed for stable working
    [CPUFREQ] EXYNOS4210: Update frequency table for cpu divider
    [CPUFREQ] EXYNOS4210: Remove code about bus on cpufreq
    [CPUFREQ] s3c64xx: Use pr_fmt() for consistent log messages
    cpufreq: OMAP: fixup for omap_device changes, include
    cpufreq: OMAP: fix freq_table leak
    cpufreq: OMAP: put clk if cpu_init failed
    cpufreq: OMAP: only supports OPP library
    cpufreq: OMAP: dont support !freq_table
    cpufreq: OMAP: deny initialization if no mpudev
    cpufreq: OMAP: move clk name decision to init
    cpufreq: OMAP: notify even with bad boot frequency
    ...

    Linus Torvalds
     

20 Dec, 2011

1 commit


15 Dec, 2011

1 commit


09 Dec, 2011

1 commit

  • [CPUFREQ] Remove wall variable from cpufreq_gov_dbs_init()

    Remove wall variable from cpufreq_gov_dbs_init() as
    get_cpu_idle_time_us() no longer updates the last_update_time
    unconditionally. Passing non-NULL last_update_time address
    will result in accounting additional idle time with
    update_ts_time_stats() before returning idle_sleeptime.

    Signed-off-by: Kamalesh Babulal
    Signed-off-by: Dave Jones
    --
    drivers/cpufreq/cpufreq_ondemand.c | 3 +--
    1 files changed, 1 insertions(+), 2 deletions(-)

    Kamalesh Babulal
     

06 Dec, 2011

1 commit

  • This patch changes fields in cpustat from a structure, to an
    u64 array. Math gets easier, and the code is more flexible.

    Signed-off-by: Glauber Costa
    Reviewed-by: KAMEZAWA Hiroyuki
    Cc: Linus Torvalds
    Cc: Andrew Morton
    Cc: Paul Turner
    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1322498719-2255-2-git-send-email-glommer@parallels.com
    Signed-off-by: Ingo Molnar

    Glauber Costa
     

26 Oct, 2011

1 commit

  • * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits)
    time, s390: Get rid of compile warning
    dw_apb_timer: constify clocksource name
    time: Cleanup old CONFIG_GENERIC_TIME references that snuck in
    time: Change jiffies_to_clock_t() argument type to unsigned long
    alarmtimers: Fix error handling
    clocksource: Make watchdog reset lockless
    posix-cpu-timers: Cure SMP accounting oddities
    s390: Use direct ktime path for s390 clockevent device
    clockevents: Add direct ktime programming function
    clockevents: Make minimum delay adjustments configurable
    nohz: Remove "Switched to NOHz mode" debugging messages
    proc: Consider NO_HZ when printing idle and iowait times
    nohz: Make idle/iowait counter update conditional
    nohz: Fix update_ts_time_stat idle accounting
    cputime: Clean up cputime_to_usecs and usecs_to_cputime macros
    alarmtimers: Rework RTC device selection using class interface
    alarmtimers: Add try_to_cancel functionality
    alarmtimers: Add more refined alarm state tracking
    alarmtimers: Remove period from alarm structure
    alarmtimers: Remove interval cap limit hack
    ...

    Linus Torvalds
     

08 Sep, 2011

1 commit

  • update_ts_time_stat currently updates idle time even if we are in
    an iowait loop at the moment. The only real users of the idle counter
    (via get_cpu_idle_time_us) are CPU governors, and they expect to get
    cumulative time for both idle and iowait times.
    The value (idle_sleeptime) is also printed to userspace by print_cpu,
    but it prints both idle and iowait times, so the idle part is misleading.

    Let's clean this up and fix update_ts_time_stat to account both counters
    properly and update consumers of idle to consider iowait time as well.
    If we do this we might use get_cpu_{idle,iowait}_time_us from other
    contexts as well and we will get expected values.

    Signed-off-by: Michal Hocko
    Cc: Dave Jones
    Cc: Arnd Bergmann
    Cc: Alexey Dobriyan
    Link: http://lkml.kernel.org/r/e9c909c221a8da402c4da07e4cd968c3218f8eb1.1314172057.git.mhocko@suse.cz
    Signed-off-by: Thomas Gleixner

    Michal Hocko
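
    The accounting split described above can be sketched in userspace as
    follows. This is an illustration under stated assumptions, not the
    kernel code: the struct, its field names and both helpers are
    hypothetical. It shows an idle period being credited to exactly one
    counter, and a governor-style consumer adding the two back together.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-CPU bookkeeping; field names are illustrative. */
struct demo_tick_stats {
    uint64_t idle_entrytime_us;   /* when the current idle period began */
    uint64_t idle_sleeptime_us;   /* idle time with no I/O outstanding */
    uint64_t iowait_sleeptime_us; /* idle time spent waiting on I/O */
};

/* Credit the elapsed idle period to exactly one of the two counters. */
static void demo_update_time_stats(struct demo_tick_stats *ts,
                                   uint64_t now_us, int nr_iowait)
{
    assert(now_us >= ts->idle_entrytime_us);
    uint64_t delta = now_us - ts->idle_entrytime_us;

    if (nr_iowait > 0)
        ts->iowait_sleeptime_us += delta;
    else
        ts->idle_sleeptime_us += delta;
    ts->idle_entrytime_us = now_us;
}

/* A consumer that wants the cumulative value adds both counters. */
static uint64_t demo_governor_idle_us(const struct demo_tick_stats *ts)
{
    return ts->idle_sleeptime_us + ts->iowait_sleeptime_us;
}
```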
     

09 Aug, 2011

1 commit


17 Mar, 2011

4 commits


26 Jan, 2011

1 commit

  • With cmwq, there's no reason for cpufreq drivers to use separate
    workqueues. Remove the dedicated workqueues from cpufreq_conservative
    and cpufreq_ondemand and use system_wq instead. The work items are
    already sync canceled on stop, so it's already guaranteed that no work
    is running on module exit.

    Signed-off-by: Tejun Heo
    Acked-by: Dave Jones
    Cc: cpufreq@vger.kernel.org

    Tejun Heo
     

22 Oct, 2010

1 commit

  • Adds a new global tunable, sampling_down_factor. Set to 1 it makes no
    changes from existing behavior, but set to greater than 1 (e.g. 100)
    it acts as a multiplier for the scheduling interval for reevaluating
    load when the CPU is at its top speed due to high load. This improves
    performance by reducing the overhead of load evaluation and helping
    the CPU stay at its top speed when truly busy, rather than shifting
    back and forth in speed. This tunable has no effect on behavior at
    lower speeds/lower CPU loads.

    This patch is against 2.6.36-rc6.

    This patch should help solve kernel bug 19672 "ondemand is slow".

    Signed-off-by: David Niemi
    Acked-by: Venkatesh Pallipadi
    CC: Daniel Hollocher
    Signed-off-by: Dave Jones

    David C Niemi
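
    A rough sketch of how such a multiplier could be applied is shown
    below. The helper name and its parameters are hypothetical, and the
    check for "at top speed" is simplified to a frequency comparison; the
    actual patch may decide this differently.

```c
#include <assert.h>

/*
 * Hypothetical helper: delay until the next load evaluation.
 * The multiplier only applies while the CPU runs at its top frequency.
 */
static unsigned long demo_next_sample_delay_us(unsigned long sampling_rate_us,
                                               unsigned int sampling_down_factor,
                                               unsigned int cur_khz,
                                               unsigned int max_khz)
{
    assert(sampling_down_factor >= 1);

    if (cur_khz == max_khz)
        return sampling_rate_us * sampling_down_factor;
    return sampling_rate_us;
}
```

    With a factor of 1 the delay is the plain sampling rate at every
    frequency, which matches the "no changes from existing behavior" case
    in the message.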
     

04 Aug, 2010

2 commits


18 May, 2010

1 commit

  • * 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    x86, hypervisor: add missing
    Modify the VMware balloon driver for the new x86_hyper API
    x86, hypervisor: Export the x86_hyper* symbols
    x86: Clean up the hypervisor layer
    x86, HyperV: fix up the license to mshyperv.c
    x86: Detect running on a Microsoft HyperV system
    x86, cpu: Make APERF/MPERF a normal table-driven flag
    x86, k8: Fix build error when K8_NB is disabled
    x86, cacheinfo: Disable index in all four subcaches
    x86, cacheinfo: Make L3 cache info per node
    x86, cacheinfo: Reorganize AMD L3 cache structure
    x86, cacheinfo: Turn off L3 cache index disable feature in virtualized environments
    x86, cacheinfo: Unify AMD L3 cache index disable checking
    cpufreq: Unify sysfs attribute definition macros
    powernow-k8: Fix frequency reporting
    x86, cpufreq: Add APERF/MPERF support for AMD processors
    x86: Unify APERF/MPERF support
    powernow-k8: Add core performance boost support
    x86, cpu: Add AMD core boosting feature flag to /proc/cpuinfo

    Fix up trivial conflicts in arch/x86/kernel/cpu/intel_cacheinfo.c and
    drivers/cpufreq/cpufreq_ondemand.c

    Linus Torvalds
     

10 May, 2010

2 commits

  • Pavel Machek pointed out that not all CPUs have an efficient
    idle at high frequency. Specifically, older Intel and various
    AMD CPUs would get a higher power usage when copying files from
    USB.

    Mike Chan pointed out that the same is true for various ARM
    chips as well.

    Thomas Renninger suggested making this a sysfs tunable with a
    reasonable default.

    This patch adds a sysfs tunable for the new behavior, and uses
    a very simple function to determine a reasonable default,
    depending on the CPU vendor/type.

    Signed-off-by: Arjan van de Ven
    Acked-by: Rik van Riel
    Acked-by: Pavel Machek
    Acked-by: Peter Zijlstra
    Cc: davej@redhat.com
    [ minor tidyup ]
    Signed-off-by: Ingo Molnar

    Arjan van de Ven
     
  • The ondemand cpufreq governor uses CPU busy time (i.e. not-idle
    time) as a measure for scaling the CPU frequency up or down.
    If the CPU is busy, the CPU frequency scales up; if it's idle,
    the CPU frequency scales down. Effectively, it uses the CPU busy
    time as a proxy variable for the more nebulous "how critical is
    performance right now" question.

    This algorithm falls flat on its face in the light of workloads
    where you're alternately disk and CPU bound, such as the ever
    popular "git grep", but also things like startup of programs and
    maildir-using email clients... much to the chagrin of Andrew
    Morton.

    This patch changes the ondemand algorithm to count iowait time
    as busy, not idle, time. As shown in the breakdown cases above,
    iowait is often performance critical, and by counting iowait,
    the proxy variable becomes a more accurate representation of the
    "how critical is performance" question.

    The problem and fix are both verified with the "perf timechart"
    tool.

    Signed-off-by: Arjan van de Ven
    Signed-off-by: Andrew Morton
    Signed-off-by: Dave Jones
    Reviewed-by: Rik van Riel
    Acked-by: Peter Zijlstra
    Signed-off-by: Ingo Molnar

    Arjan van de Ven
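
    The two commits above can be summarized with a small sketch of the
    load calculation. Everything here is an assumption made for
    illustration: the function name, the flag name io_is_busy (the log
    does not name the tunable), and the use of separate idle and iowait
    inputs.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical load calculation over one sampling window.
 * idle_us is pure idle time; iowait_us is time spent waiting on I/O.
 * When io_is_busy is non-zero, iowait counts towards busy time.
 */
static unsigned int demo_load_percent(uint64_t wall_us, uint64_t idle_us,
                                      uint64_t iowait_us, int io_is_busy)
{
    assert(wall_us > 0 && idle_us + iowait_us <= wall_us);
    uint64_t not_busy_us = idle_us;

    if (!io_is_busy)
        not_busy_us += iowait_us;
    return (unsigned int)(100 * (wall_us - not_busy_us) / wall_us);
}
```

    For a window that is half iowait, the computed load differs sharply
    between the two settings, which is why a disk-bound workload such as
    "git grep" keeps the CPU at a higher frequency once iowait counts as
    busy.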
     

10 Apr, 2010

1 commit

  • Multiple modules used to define their own sysfs attribute definition
    macros, which had identical functionality and were needlessly replicated
    among the different cpufreq drivers. Push them into the header and
    remove the duplication.

    Signed-off-by: Borislav Petkov
    Reviewed-by: Thomas Renninger
    Signed-off-by: H. Peter Anvin

    Borislav Petkov
     

13 Jan, 2010

1 commit

  • Dominik said:
    target_freq cannot be below policy->min or above policy->max.
    If it were, the whole cpufreq subsystem is broken.

    But (answer):
    I think the "ondemand" governor can ask for a target frequency that is
    below policy->min.
    ...
    A patch such as below may be needed to sanitize the target frequency
    requested by "ondemand". The "conservative" governor already has this check:

    Signed-off-by: Thomas Renninger
    Signed-off-by: Dave Jones

    # diff -bur x/drivers/cpufreq/cpufreq_ondemand.c.orig y/drivers/cpufreq/cpufreq_ondemand.c

    Nagananda.Chumbalkar@hp.com
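
    The kind of sanity check discussed above can be sketched as a plain
    clamp. The helper below is hypothetical; the message only mentions the
    lower bound explicitly, and the sketch bounds the request on both
    sides for completeness.

```c
#include <assert.h>

/* Hypothetical helper: keep a requested frequency inside the policy limits. */
static unsigned int demo_clamp_freq_khz(unsigned int target_khz,
                                        unsigned int min_khz,
                                        unsigned int max_khz)
{
    assert(min_khz <= max_khz);

    if (target_khz < min_khz)
        return min_khz;
    if (target_khz > max_khz)
        return max_khz;
    return target_khz;
}
```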
     

18 Nov, 2009

1 commit

  • The ondemand and conservative governors are messing up time units in the
    code path where NO_HZ is not enabled and ignore_nice is set. The stored
    wall time and idle time are in jiffies, while the nice time calculation
    is done in microseconds.

    The problem was reported and diagnosed by Alexander here.
    http://marc.info/?l=linux-kernel&m=125752550404513&w=2

    The patch below fixes this thinko.

    Reported-by: Alexander Miller
    Tested-by: Alexander Miller
    Signed-off-by: Venkatesh Pallipadi
    Signed-off-by: Dave Jones

    Pallipadi, Venkatesh
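
    The unit mismatch described above can be shown with a short sketch.
    The helpers are hypothetical, and the sketch assumes that nice time is
    combined with idle time when ignore_nice is set, which the message
    implies but does not spell out. The fix amounts to converting to one
    unit before combining the values.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical conversion helper. It assumes hz divides 1000000 evenly,
 * which holds for the common values 100, 250 and 1000.
 */
static uint64_t demo_jiffies_to_usecs(uint64_t jiffies, unsigned int hz)
{
    assert(hz > 0);
    return jiffies * (1000000ULL / hz);
}

/* Combine idle and nice time only after both are in microseconds. */
static uint64_t demo_idle_plus_nice_us(uint64_t idle_jiffies,
                                       uint64_t nice_us, unsigned int hz)
{
    return demo_jiffies_to_usecs(idle_jiffies, hz) + nice_us;
}
```

    Adding the raw jiffy count to the microsecond value would understate
    the idle contribution by a factor of 1000000 / HZ.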
     

19 Sep, 2009

1 commit

  • * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq:
    [CPUFREQ] Fix NULL ptr regression in powernow-k8
    [CPUFREQ] Create a blacklist for processors that should not load the acpi-cpufreq module.
    [CPUFREQ] Powernow-k8: Enable more than 2 low P-states
    [CPUFREQ] remove rwsem lock from CPUFREQ_GOV_STOP call (second call site)
    [CPUFREQ] ondemand - Use global sysfs dir for tuning settings
    [CPUFREQ] Introduce global, not per core: /sys/devices/system/cpu/cpufreq
    [CPUFREQ] Bail out of cpufreq_add_dev if the link for a managed CPU got created
    [CPUFREQ] Factor out policy setting from cpufreq_add_dev
    [CPUFREQ] Factor out interface creation from cpufreq_add_dev
    [CPUFREQ] Factor out symlink creation from cpufreq_add_dev
    [CPUFREQ] cleanup up -ENOMEM handling in cpufreq_add_dev
    [CPUFREQ] Reduce scope of cpu_sys_dev in cpufreq_add_dev
    [CPUFREQ] update Doc for cpuinfo_cur_freq and scaling_cur_freq

    Linus Torvalds
     

02 Sep, 2009

1 commit

  • Ondemand has only global variables for userspace tunings via sysfs.
    But they were exposed per CPU which wrongly implies to the user that
    his settings are applied per cpu. Also locking sysfs against concurrent
    access won't be necessary anymore after deprecation time.

    This means the ondemand config dir is moved:
    /sys/devices/system/cpu/cpu*/cpufreq/ondemand ->
    /sys/devices/system/cpu/cpufreq/ondemand

    The old files will still exist, but reading or writing to them will
    result in one (printk_once) deprecation msg to syslog per file.

    Signed-off-by: Thomas Renninger
    Signed-off-by: Dave Jones

    Thomas Renninger
     

14 Aug, 2009

1 commit

  • Conflicts:
    arch/sparc/kernel/smp_64.c
    arch/x86/kernel/cpu/perf_counter.c
    arch/x86/kernel/setup_percpu.c
    drivers/cpufreq/cpufreq_ondemand.c
    mm/percpu.c

    Conflicts in core and arch percpu codes are mostly from commit
    ed78e1e078dd44249f88b1dd8c76dafb39567161 which substituted many
    num_possible_cpus() with nr_cpu_ids. As for-next branch has moved all
    the first chunk allocators into mm/percpu.c, the changes are moved
    from arch code to mm/percpu.c.

    Signed-off-by: Tejun Heo

    Tejun Heo
     

07 Jul, 2009

2 commits

  • Redesign the locking inside the ondemand driver. Make dbs_mutex handle all the
    global state changes inside the driver and invent a new percpu mutex
    to serialize the percpu timer and frequency limit changes.

    Signed-off-by: Venkatesh Pallipadi
    Signed-off-by: Dave Jones

    venkatesh.pallipadi@intel.com
     
  • Commit b14893a62c73af0eca414cfed505b8c09efc613c, although very much
    needed to properly clean up the ondemand timer, opened up a can of worms
    related to locking dependencies in cpufreq.

    This patch defines the need for dbs_mutex and cleans up its usage in
    the ondemand governor. It also resolves the lockdep warnings reported here:

    http://lkml.indiana.edu/hypermail/linux/kernel/0906.1/01925.html
    http://lkml.indiana.edu/hypermail/linux/kernel/0907.0/00820.html

    and a few others.

    Signed-off-by: Venkatesh Pallipadi
    Signed-off-by: Dave Jones

    venkatesh.pallipadi@intel.com
     

24 Jun, 2009

1 commit

  • Percpu variable definition is about to be updated such that all percpu
    symbols including the static ones must be unique. Update percpu
    variable definitions accordingly.

    * as,cfq: rename ioc_count uniquely

    * cpufreq: rename cpu_dbs_info uniquely

    * xen: move nesting_count out of xen_evtchn_do_upcall() and rename it

    * mm: move ratelimits out of balance_dirty_pages_ratelimited_nr() and
    rename it

    * ipv4,6: rename cookie_scratch uniquely

    * x86 perf_counter: rename prev_left to pmc_prev_left, irq_entry to
    pmc_irq_entry and nmi_entry to pmc_nmi_entry

    * perf_counter: rename disable_count to perf_disable_count

    * ftrace: rename test_event_disable to ftrace_test_event_disable

    * kmemleak: rename test_pointer to kmemleak_test_pointer

    * mce: rename next_interval to mce_next_interval

    [ Impact: percpu usage cleanups, no duplicate static percpu var names ]

    Signed-off-by: Tejun Heo
    Reviewed-by: Christoph Lameter
    Cc: Ivan Kokshaysky
    Cc: Jens Axboe
    Cc: Dave Jones
    Cc: Jeremy Fitzhardinge
    Cc: linux-mm
    Cc: David S. Miller
    Cc: Peter Zijlstra
    Cc: Steven Rostedt
    Cc: Li Zefan
    Cc: Catalin Marinas
    Cc: Andi Kleen

    Tejun Heo
     

15 Jun, 2009

2 commits

  • Update the documentation accordingly.
    Cleanup and use printk_once.

    Signed-off-by: Thomas Renninger
    Signed-off-by: Dave Jones

    Thomas Renninger
     
  • With this patch you have the following minimal sampling rate restrictions:

    Kernel restrictions:
    If CONFIG_NO_HZ is set, the limit is 10ms fixed.
    If CONFIG_NO_HZ is not set or no_hz=off boot parameter is used, the
    limits depend on the CONFIG_HZ option:
    HZ=1000: min=20000us (20ms)
    HZ=250: min=80000us (80ms)
    HZ=100: min=200000us (200ms)

    HW restrictions:
    Do not sample/poll more often than HW latency * 100 exported by the low
    level cpufreq HW driver

    The higher of the above restrictions is the minimal sampling rate
    that can be set (and can be seen via the ondemand/sampling_rate_min sysfs file).

    The default sampling rate is still HW latency * 1000, but this will now end
    up in lower values on the latest (Intel and AMD) hardware, as these can switch
    really fast and the sampling rate was mostly limited to 80ms or 200ms
    (depending on whether HZ=250 or HZ=100 is used).

    Signed-off-by: Thomas Renninger
    Cc: Pallipadi Venkatesh
    Signed-off-by: Dave Jones

    Thomas Renninger
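
    The restrictions listed above can be worked through in a small sketch.
    All names are hypothetical. The constant of 20 ticks is an inference
    from the three HZ figures in the message (each equals 20 ticks at that
    HZ), and raising the default to the minimum is an assumption based on
    the remark that the rate used to be limited by the minimum.

```c
#include <assert.h>

#define DEMO_MIN_RATE_NOHZ_US 10000UL /* fixed 10ms limit with NO_HZ */
#define DEMO_MIN_RATE_TICKS   20UL    /* inferred: 20ms at HZ=1000, 80ms at HZ=250 */

/* Kernel-side lower bound in microseconds. */
static unsigned long demo_kernel_min_rate_us(int no_hz_active, unsigned int hz)
{
    assert(hz > 0);
    if (no_hz_active)
        return DEMO_MIN_RATE_NOHZ_US;
    return DEMO_MIN_RATE_TICKS * (1000000UL / hz);
}

/* The effective minimum is the higher of the kernel and hardware limits. */
static unsigned long demo_min_sampling_rate_us(int no_hz_active, unsigned int hz,
                                               unsigned long hw_latency_us)
{
    unsigned long kernel_min = demo_kernel_min_rate_us(no_hz_active, hz);
    unsigned long hw_min = hw_latency_us * 100;

    return kernel_min > hw_min ? kernel_min : hw_min;
}

/* Default is HW latency * 1000, raised to the minimum when it falls below it. */
static unsigned long demo_default_sampling_rate_us(int no_hz_active, unsigned int hz,
                                                   unsigned long hw_latency_us)
{
    unsigned long min_rate = demo_min_sampling_rate_us(no_hz_active, hz, hw_latency_us);
    unsigned long def_rate = hw_latency_us * 1000;

    return def_rate > min_rate ? def_rate : min_rate;
}
```

    For hardware with a 10us latency, the sketch yields a default of 10ms
    with NO_HZ active, compared with 80ms at HZ=250 without it.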
     

27 May, 2009

1 commit

  • * Rafael J. Wysocki (rjw@sisk.pl) wrote:
    > This message has been generated automatically as a part of a report
    > of regressions introduced between 2.6.28 and 2.6.29.
    >
    > The following bug entry is on the current list of known regressions
    > introduced between 2.6.28 and 2.6.29. Please verify if it still should
    > be listed and let me know (either way).
    >
    >
    > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=13186
    > Subject : cpufreq timer teardown problem
    > Submitter : Mathieu Desnoyers
    > Date : 2009-04-23 14:00 (24 days old)
    > References : http://marc.info/?l=linux-kernel&m=124049523515036&w=4
    > Handled-By : Mathieu Desnoyers
    > Patch : http://patchwork.kernel.org/patch/19754/
    > http://patchwork.kernel.org/patch/19753/
    >

    (updated changelog)

    cpufreq fix timer teardown in ondemand governor

    The problem is that dbs_timer_exit() uses cancel_delayed_work() when it should
    use cancel_delayed_work_sync(). cancel_delayed_work() does not wait for the
    workqueue handler to exit.

    The ondemand governor does not seem to be affected because the
    "if (!dbs_info->enable)" check at the beginning of the workqueue handler returns
    immediately without rescheduling the work. The conservative governor in
    2.6.30-rc has the same check as the ondemand governor, which makes things
    usually run smoothly. However, if the governor is quickly stopped and then
    started, this could lead to the following race:

    dbs_enable could be reenabled and multiple do_dbs_timer handlers would run.
    This is why a synchronized teardown is required.

    The following patch applies to, at least, 2.6.28.x, 2.6.29.1, 2.6.30-rc2.

    Depends on patch
    cpufreq: remove rwsem lock from CPUFREQ_GOV_STOP call

    Signed-off-by: Mathieu Desnoyers
    CC: Andrew Morton
    CC: gregkh@suse.de
    CC: stable@kernel.org
    CC: cpufreq@vger.kernel.org
    CC: Ingo Molnar
    CC: rjw@sisk.pl
    CC: Ben Slusky
    Signed-off-by: Dave Jones

    Mathieu Desnoyers
     

25 Feb, 2009

3 commits


06 Feb, 2009

1 commit


06 Jan, 2009

1 commit

  • Impact: use new cpumask API to reduce memory usage

    This is part of an effort to reduce structure sizes for machines
    configured with large NR_CPUS. cpumask_t gets replaced by
    cpumask_var_t, which is either struct cpumask[1] (small NR_CPUS) or
    struct cpumask * (large NR_CPUS).

    Signed-off-by: Rusty Russell
    Signed-off-by: Mike Travis
    Acked-by: Dave Jones
    Signed-off-by: Ingo Molnar

    Rusty Russell
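
    The "array of one or pointer" idea described above can be sketched in
    userspace as follows. The demo_* names and the DEMO_CPUMASK_OFFSTACK
    switch are hypothetical stand-ins, not the kernel's definitions. The
    point is that callers are written once and work with either
    representation, because a one-element array decays to a pointer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_NR_CPUS 64
#define DEMO_BITS_PER_LONG (8 * sizeof(unsigned long))
#define DEMO_MASK_LONGS \
    ((DEMO_NR_CPUS + DEMO_BITS_PER_LONG - 1) / DEMO_BITS_PER_LONG)

struct demo_cpumask {
    unsigned long bits[DEMO_MASK_LONGS];
};

#ifdef DEMO_CPUMASK_OFFSTACK
/* Large configurations: the variable is a pointer, storage is allocated. */
typedef struct demo_cpumask *demo_cpumask_var_t;

static int demo_alloc_cpumask_var(demo_cpumask_var_t *mask)
{
    *mask = calloc(1, sizeof(struct demo_cpumask));
    return *mask != NULL;
}

static void demo_free_cpumask_var(demo_cpumask_var_t mask)
{
    free(mask);
}
#else
/* Small configurations: a one-element array, so the storage is inline. */
typedef struct demo_cpumask demo_cpumask_var_t[1];

static int demo_alloc_cpumask_var(demo_cpumask_var_t *mask)
{
    memset(*mask, 0, sizeof(struct demo_cpumask));
    return 1;
}

static void demo_free_cpumask_var(demo_cpumask_var_t mask)
{
    (void)mask; /* nothing to release for inline storage */
}
#endif

/* Users take a plain struct pointer and work with both variants. */
static void demo_cpumask_set_cpu(unsigned int cpu, struct demo_cpumask *mask)
{
    assert(cpu < DEMO_NR_CPUS);
    mask->bits[cpu / DEMO_BITS_PER_LONG] |= 1UL << (cpu % DEMO_BITS_PER_LONG);
}

static int demo_cpumask_test_cpu(unsigned int cpu, const struct demo_cpumask *mask)
{
    assert(cpu < DEMO_NR_CPUS);
    return (int)((mask->bits[cpu / DEMO_BITS_PER_LONG]
                  >> (cpu % DEMO_BITS_PER_LONG)) & 1UL);
}
```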
     

10 Oct, 2008

4 commits

  • Use get_cpu()/put_cpu() in cpufreq_ondemand init routine, instead of
    smp_processor_id() to avoid the following BUG:

    [ 35.313118] BUG: using smp_processor_id() in preemptible [00000000] code=: modprobe/4952
    [ 35.313132] caller is cpufreq_gov_dbs_init+0xa/0x8f [cpufreq_ondemand]
    [ 35.313140] Pid: 4952, comm: modprobe Not tainted 2.6.27-rc5-mm1 #23
    [ 35.313145] Call Trace:
    [ 35.313158] [] debug_smp_processor_id+0xd7/0xe0
    [ 35.313167] [] cpufreq_gov_dbs_init+0xa/0x8f [cpufreq_ondemand]
    [ 35.313176] [] _stext+0x3b/0x160
    [ 35.313185] [] __mutex_unlock_slowpath+0xe5/0x190
    [ 35.313195] [] trace_hardirqs_on_caller+0xca/0x140
    [ 35.313205] [] sys_init_module+0xdc/0x210
    [ 35.313212] [] system_call_fastpath+0x16/0x1b

    Signed-off-by: Andrea Righi
    Signed-off-by: Dave Jones

    Andrea Righi
     
  • We don't need to export the governors for use as the default governor,
    because the default governor will be built-in anyway and we can access
    the symbol directly.

    This also fixes the following sparse warnings:

    drivers/cpufreq/cpufreq_conservative.c:578:25: warning: symbol 'cpufreq_gov_conservative' was not declared. Should it be static?
    drivers/cpufreq/cpufreq_ondemand.c:582:25: warning: symbol 'cpufreq_gov_ondemand' was not declared. Should it be static?
    drivers/cpufreq/cpufreq_performance.c:39:25: warning: symbol 'cpufreq_gov_performance' was not declared. Should it be static?
    drivers/cpufreq/cpufreq_powersave.c:38:25: warning: symbol 'cpufreq_gov_powersave' was not declared. Should it be static?
    drivers/cpufreq/cpufreq_userspace.c:190:25: warning: symbol 'cpufreq_gov_userspace' was not declared. Should it be static?

    Signed-off-by: Sven Wegener
    Signed-off-by: Dave Jones

    Sven Wegener
     
  • Use get_cpu_idle_time_us() to get micro-accounted idle information.
    This enables ondemand to get more accurate idle and busy timings
    than the jiffy-based calculation. As a result, we can decrease
    the ondemand safety guard band from 80-10 to 95-3.

    Results in more aggressive power savings.

    Signed-off-by: Venkatesh Pallipadi
    Signed-off-by: Dave Jones

    venkatesh.pallipadi@intel.com
     
  • Use a parameter for down differential, instead of the hardcoded 10%. A follow-on
    patch changes the down-differential dynamically, based on whether
    we are using idle micro-accounting or not.

    Signed-off-by: Venkatesh Pallipadi
    Signed-off-by: Dave Jones

    venkatesh.pallipadi@intel.com
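
    The guard band mentioned in the last two commits (an up threshold and
    a down differential, 80-10 versus 95-3) can be sketched as a decision
    function. This is an illustration under stated assumptions: the
    function name is hypothetical, and the proportional scale-down formula
    is an assumption, since the messages only name the two parameters.

```c
#include <assert.h>

/*
 * Hypothetical frequency decision.
 * Above up_threshold: jump to the maximum frequency.
 * Below (up_threshold - down_differential): scale down proportionally.
 * In between: keep the current frequency, which avoids oscillation.
 */
static unsigned int demo_pick_freq_khz(unsigned int load_percent,
                                       unsigned int cur_khz,
                                       unsigned int max_khz,
                                       unsigned int up_threshold,
                                       unsigned int down_differential)
{
    assert(load_percent <= 100 && up_threshold > down_differential);
    unsigned int lower_bound = up_threshold - down_differential;

    if (load_percent > up_threshold)
        return max_khz;
    if (load_percent < lower_bound)
        return (unsigned int)((unsigned long long)load_percent * cur_khz
                              / lower_bound);
    return cur_khz;
}
```

    With the 80-10 band, a load of 75% leaves the frequency unchanged,
    while the 95-3 band lowers it, which is consistent with the "more
    aggressive power savings" noted above.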