23 Feb, 2017

1 commit

  • These macros can be reused by governors which don't use the common
    governor code present in cpufreq_governor.c and should be moved to the
    relevant header.

    Now that they are getting moved to the right header file, reuse them in
    the schedutil governor as well (which required renaming the show/store
    routines).

    Also create a gov_attr_wo() macro for write-only sysfs files; this will be
    used by the Interactive governor in a later patch.

    Signed-off-by: Viresh Kumar

    Viresh Kumar
     

14 Sep, 2016

1 commit

  • Modify the schedutil cpufreq governor to boost the CPU
    frequency if the SCHED_CPUFREQ_IOWAIT flag is passed to
    it via cpufreq_update_util().

    If that happens, the frequency is set to the maximum during
    the first update after receiving the SCHED_CPUFREQ_IOWAIT flag
    and then the boost is reduced by half during each following update.

    Signed-off-by: Rafael J. Wysocki
    Looks-good-to: Steve Muckle
    Acked-by: Peter Zijlstra (Intel)

    Rafael J. Wysocki
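
The boost decay described in this commit message can be sketched as follows. This is only an illustration under stated assumptions: the struct, the flag constant and the function name are hypothetical stand-ins, not the governor's actual code.

```c
/* Hypothetical flag value; it only stands in for SCHED_CPUFREQ_IOWAIT. */
#define SKETCH_FLAG_IOWAIT (1U << 0)

/* Hypothetical per-CPU state; field names are illustrative only. */
struct sketch_boost {
	unsigned int boost;    /* current boost frequency, 0 when inactive */
	unsigned int max_freq; /* maximum frequency of the policy */
};

/*
 * Called once per utilization update. Returns the boost frequency that
 * applies to this update: the maximum on a flagged update, then half of
 * the previous value on each following update.
 */
static unsigned int sketch_iowait_boost(struct sketch_boost *b,
					unsigned int flags)
{
	if (flags & SKETCH_FLAG_IOWAIT) {
		b->boost = b->max_freq;
		return b->boost;
	}

	b->boost >>= 1; /* decays towards 0 */
	return b->boost;
}
```

With a maximum of 2000000 kHz, one flagged update followed by two unflagged ones yields 2000000, 1000000 and 500000.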
     

01 Sep, 2016

1 commit

  • PELT does not consider SMT when scaling its utilization values via
    arch_scale_cpu_capacity(). The value in rq->cpu_capacity_orig does
    take SMT into consideration though and therefore may be smaller than
    the utilization reported by PELT.

    On an Intel i7-3630QM for example rq->cpu_capacity_orig is 589 but
    util_avg scales up to 1024. This means that a 50% utilized CPU will show
    up in schedutil as ~86% busy.

    Fix this by using the same CPU scaling value in schedutil as that which
    is used by PELT.

    Signed-off-by: Steve Muckle
    Signed-off-by: Rafael J. Wysocki

    Steve Muckle
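
The arithmetic behind the "~86% busy" figure in this commit message can be checked with a small helper. The function name is hypothetical, and the value 512 is an assumption standing for a 50% utilized CPU on the 1024 scale.

```c
/*
 * Hypothetical helper: utilization as a percentage of the capacity value
 * it is compared against (integer division, so the result is truncated).
 */
static unsigned int sketch_busy_pct(unsigned long util, unsigned long capacity)
{
	return (unsigned int)(util * 100 / capacity);
}
```

Comparing util = 512 against a capacity of 589 gives 86, while comparing it against the 1024 scale that PELT uses gives the expected 50.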
     

17 Aug, 2016

1 commit

  • It is useful to know the reason why cpufreq_update_util() has just
    been called and that can be passed as flags to cpufreq_update_util()
    and to the ->func() callback in struct update_util_data. However,
    doing that in addition to passing the util and max arguments they
    already take would be clumsy, so avoid it.

    Instead, use the observation that the schedutil governor is part
    of the scheduler proper, so it can access scheduler data directly.
    This allows the util and max arguments of cpufreq_update_util()
    and the ->func() callback in struct update_util_data to be replaced
    with a flags one, but schedutil has to be modified to follow.

    Thus make the schedutil governor obtain the CFS utilization
    information from the scheduler and use the "RT" and "DL" flags
    instead of the special utilization value of ULONG_MAX to track
    updates from the RT and DL sched classes. Make it non-modular
    too to avoid having to export scheduler variables to modules at
    large.

    Next, update all of the other users of cpufreq_update_util()
    and the ->func() callback in struct update_util_data accordingly.

    Suggested-by: Peter Zijlstra
    Signed-off-by: Rafael J. Wysocki
    Acked-by: Peter Zijlstra (Intel)
    Acked-by: Viresh Kumar

    Rafael J. Wysocki
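
A rough sketch of the flags-based approach described in this commit message: the update handler receives only flags and reads the CFS utilization from scheduler data itself. All names here (the flag constants, `struct sketch_rq`, `sketch_pick_freq`) are hypothetical, and the frequency formula is the 1.25 * max_freq * util / max one from the original schedutil commit.

```c
/* Hypothetical flag values standing in for the "RT" and "DL" flags. */
#define SKETCH_FLAG_RT (1U << 0)
#define SKETCH_FLAG_DL (1U << 1)

/* Hypothetical stand-in for the scheduler data the governor can read. */
struct sketch_rq {
	unsigned long cfs_util; /* CFS utilization */
	unsigned long max;      /* utilization scale, e.g. 1024 */
};

/*
 * Updates from the RT or DL classes request the maximum frequency;
 * anything else is scaled from the CFS utilization.
 */
static unsigned int sketch_pick_freq(const struct sketch_rq *rq,
				     unsigned int max_freq,
				     unsigned int flags)
{
	unsigned long long freq;

	if (flags & (SKETCH_FLAG_RT | SKETCH_FLAG_DL))
		return max_freq;

	freq = max_freq + (max_freq >> 2); /* 1.25 * max_freq */
	return (unsigned int)(freq * rq->cfs_util / rq->max);
}
```

The point of the design is that no special utilization value such as ULONG_MAX is needed any more: the caller's identity travels in the flags.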
     

22 Jul, 2016

1 commit

  • The slow-path frequency transition path is relatively expensive as it
    requires waking up a thread to do work. Should support be added for
    remote CPU cpufreq updates, that would also be expensive since it
    requires an IPI. These activities should be avoided if they are not
    necessary.

    To that end, calculate the actual driver-supported frequency required by
    the new utilization value in schedutil by using the recently added
    cpufreq_driver_resolve_freq API. If it is the same as the previously
    requested driver frequency then there is no need to continue with the
    update assuming the cpu frequency limits have not changed. This will
    have additional benefits should the semantics of the rate limit be
    changed to apply solely to frequency transitions rather than to
    frequency calculations in schedutil.

    The last raw required frequency is cached. This allows the driver
    frequency lookup to be skipped in the event that the new raw required
    frequency matches the last one, assuming a frequency update has not been
    forced due to limits changing (indicated by a next_freq value of
    UINT_MAX, see sugov_should_update_freq).

    Signed-off-by: Steve Muckle
    Reviewed-by: Viresh Kumar
    Signed-off-by: Rafael J. Wysocki

    Steve Muckle
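
The two short-circuits described in this commit message (skip the driver lookup when the raw frequency repeats, skip the update when the resolved frequency repeats) can be sketched as below. Everything here is hypothetical: the struct, the function names, and in particular `sketch_resolve()`, which merely rounds up to a 100000 kHz step in place of a real driver lookup such as cpufreq_driver_resolve_freq.

```c
#include <limits.h>

/* Hypothetical governor state; field names are illustrative only. */
struct sketch_cache {
	unsigned int cached_raw_freq; /* last raw required frequency */
	unsigned int next_freq;       /* last driver frequency, UINT_MAX forces an update */
	unsigned int lookups;         /* counts driver lookups, for illustration */
};

/* Stand-in for the driver lookup: round up to a multiple of 100000 kHz. */
static unsigned int sketch_resolve(struct sketch_cache *c, unsigned int raw)
{
	c->lookups++;
	return (raw + 99999) / 100000 * 100000;
}

/* Returns 1 if a frequency transition is needed, 0 otherwise. */
static int sketch_update(struct sketch_cache *c, unsigned int raw)
{
	unsigned int freq;

	/* Same raw request and no forced update: skip the lookup entirely. */
	if (raw == c->cached_raw_freq && c->next_freq != UINT_MAX)
		return 0;

	c->cached_raw_freq = raw;
	freq = sketch_resolve(c, raw);

	/* Different raw request but same driver frequency: nothing to do. */
	if (freq == c->next_freq)
		return 0;

	c->next_freq = freq;
	return 1;
}
```

Setting `next_freq` to UINT_MAX models the limits-changed case: the next call performs the lookup even if the raw frequency matches the cached one.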
     

03 Jun, 2016

2 commits

  • Create a new helper to avoid code duplication across governors.

    Signed-off-by: Viresh Kumar
    Signed-off-by: Rafael J. Wysocki

    Viresh Kumar
     
  • The design of the cpufreq governor API is not very straightforward,
    as struct cpufreq_governor provides only one callback to be invoked
    from different code paths for different purposes. The purpose it is
    invoked for is determined by its second "event" argument, causing it
    to act as a "callback multiplexer" of sorts.

    Unfortunately, that leads to extra complexity in governors, some of
    which implement the ->governor() callback as a switch statement
    that simply checks the event argument and invokes a separate function
    to handle that specific event.

    That extra complexity can be eliminated by replacing the all-purpose
    ->governor() callback with a family of callbacks to carry out specific
    governor operations: initialization and exit, start and stop and policy
    limits updates. That also turns out to reduce the code size, so
    do it.

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Viresh Kumar

    Rafael J. Wysocki
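
The "family of callbacks" design described in this commit message might look roughly like the following. This is a sketch only: the struct layout, the policy fields and the toy governor are hypothetical and are not copied from struct cpufreq_governor.

```c
/* Hypothetical policy state; field names are illustrative only. */
struct sketch_gov_policy {
	unsigned int min, max, cur;
	int running;
};

/* One callback per governor operation instead of one multiplexed callback. */
struct sketch_governor {
	const char *name;
	int  (*init)(struct sketch_gov_policy *policy);
	void (*exit)(struct sketch_gov_policy *policy);
	int  (*start)(struct sketch_gov_policy *policy);
	void (*stop)(struct sketch_gov_policy *policy);
	void (*limits)(struct sketch_gov_policy *policy);
};

static int sketch_start(struct sketch_gov_policy *policy)
{
	policy->running = 1;
	return 0;
}

static void sketch_stop(struct sketch_gov_policy *policy)
{
	policy->running = 0;
}

/* Clamp the current frequency into the new [min, max] range. */
static void sketch_limits(struct sketch_gov_policy *policy)
{
	if (policy->cur > policy->max)
		policy->cur = policy->max;
	else if (policy->cur < policy->min)
		policy->cur = policy->min;
}

/* A toy governor: init and exit stay NULL because it has nothing to set up. */
static const struct sketch_governor sketch_gov = {
	.name   = "sketch",
	.start  = sketch_start,
	.stop   = sketch_stop,
	.limits = sketch_limits,
};
```

A governor fills in only the operations it needs, so no switch statement over an event argument is required.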
     

19 May, 2016

1 commit

  • Prefix print messages with KBUILD_MODNAME, i.e. 'cpufreq_schedutil: '.
    This helps to keep similar formatting for all the print messages
    particular to a file and identify those easily in kernel logs.

    It's already done this way for the rest of the governors.

    Along with that, remove the (now) redundant bits from a print message.

    Signed-off-by: Viresh Kumar
    Signed-off-by: Rafael J. Wysocki

    Viresh Kumar
     

09 Apr, 2016

1 commit

  • Due to differences in the cpufreq core's handling of runtime CPU
    offline and the disabling of nonboot CPUs during system suspend-to-RAM,
    fast frequency switching gets disabled after a suspend-to-RAM and
    resume cycle on all of the nonboot CPUs.

    To prevent that from happening, move the invocation of
    cpufreq_disable_fast_switch() from cpufreq_exit_governor() to
    sugov_exit(), as the schedutil governor is the only user of fast
    frequency switching today anyway.

    That simply prevents cpufreq_disable_fast_switch() from being called
    without invoking the ->governor callback for the CPUFREQ_GOV_POLICY_EXIT
    event (which happens during system suspend now).

    Fixes: b7898fda5bc7 (cpufreq: Support for fast frequency switching)
    Signed-off-by: Rafael J. Wysocki
    Acked-by: Viresh Kumar

    Rafael J. Wysocki
     

02 Apr, 2016

1 commit

  • Add a new cpufreq scaling governor, called "schedutil", that uses
    scheduler-provided CPU utilization information as input for making
    its decisions.

    Doing that is possible after commit 34e2c555f3e1 (cpufreq: Add
    mechanism for registering utilization update callbacks) that
    introduced cpufreq_update_util() called by the scheduler on
    utilization changes (from CFS) and RT/DL task status updates.
    In particular, CPU frequency scaling decisions may be based on
    the utilization data passed to cpufreq_update_util() by CFS.

    The new governor is relatively simple.

    The frequency selection formula used by it depends on whether or not
    the utilization is frequency-invariant. In the frequency-invariant
    case the new CPU frequency is given by

    next_freq = 1.25 * max_freq * util / max

    where util and max are the last two arguments of cpufreq_update_util().
    In turn, if util is not frequency-invariant, the maximum frequency in
    the above formula is replaced with the current frequency of the CPU:

    next_freq = 1.25 * curr_freq * util / max

    The coefficient 1.25 corresponds to the frequency tipping point at
    (util / max) = 0.8.

    All of the computations are carried out in the utilization update
    handlers provided by the new governor. One of those handlers is
    used for cpufreq policies shared between multiple CPUs and the other
    one is for policies with one CPU only (and therefore it doesn't need
    to use any extra synchronization means).

    The governor supports fast frequency switching if that is supported
    by the cpufreq driver in use and possible for the given policy.
    In the fast switching case, all operations of the governor take
    place in its utilization update handlers. If fast switching cannot
    be used, the frequency switch operations are carried out with the
    help of a work item which only calls __cpufreq_driver_target()
    (under a mutex) to trigger a frequency update (to a value already
    computed beforehand in one of the utilization update handlers).

    Currently, the governor treats all of the RT and DL tasks as
    "unknown utilization" and sets the frequency to the allowed
    maximum when updated from the RT or DL sched classes. That
    heavy-handed approach should be replaced with something more
    subtle and specifically targeted at RT and DL tasks.

    The governor shares some tunables management code with the
    "ondemand" and "conservative" governors and uses some common
    definitions from cpufreq_governor.h, but apart from that it
    is stand-alone.

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Viresh Kumar
    Acked-by: Peter Zijlstra (Intel)

    Rafael J. Wysocki
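
The frequency selection formula from this commit message can be sketched in integer arithmetic. This is a minimal illustration, not the governor's code: the function name and parameters are hypothetical, and `ref_freq` stands for max_freq in the frequency-invariant case and for curr_freq otherwise.

```c
/*
 * next_freq = 1.25 * ref_freq * util / max
 *
 * The factor 1.25 is computed as ref_freq + ref_freq / 4, and the product
 * is widened to 64 bits so it cannot overflow before the division.
 */
static unsigned int sketch_next_freq(unsigned int ref_freq,
				     unsigned long util,
				     unsigned long max)
{
	unsigned long long freq = ref_freq + (ref_freq >> 2);

	return (unsigned int)(freq * util / max);
}
```

At the tipping point (util / max) = 0.8 the result equals ref_freq: with ref_freq = 1000000, util = 8 and max = 10 the function returns 1000000. Below that ratio the requested frequency is lower than ref_freq, above it higher.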