21 Mar, 2016

1 commit

  • Commit a9ceb78bc75c (cpuidle,menu: use interactivity_req to disable
    polling) changed the behavior of the fallback state selection part
    of menu_select() so it looks at interactivity_req instead of
    data->next_timer_us when it makes its decision. That effectively
    caused polling to be used more often as the fallback idle state,
    which led to significant increases in energy consumption in some
    cases.

    Commit e132b9b3bc7f (cpuidle: menu: use high confidence factors
    only when considering polling) changed that logic again to be more
    predictable, but that didn't help with the increased energy
    consumption problem.

    For this reason, go back to making decisions on which state to fall
    back to based on data->next_timer_us, which is a time when we know
    for sure that something will happen, rather than on a prediction
    (which may be inaccurate, and turns out to be inaccurate often enough
    to be problematic). However, take the target residency of the first
    proper idle state (C1) into account, so that state is not used as
    the fallback one if its target residency is greater than
    data->next_timer_us (a simplified sketch of this check follows the
    entry).

    Fixes: a9ceb78bc75c (cpuidle,menu: use interactivity_req to disable polling)
    Signed-off-by: Rafael J. Wysocki
    Reported-and-tested-by: Doug Smythies

    Rafael J. Wysocki
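
    A minimal sketch of the fallback check described above, using
    simplified types and hypothetical names rather than the actual menu
    governor code:

      /* Base the fallback decision on next_timer_us (a known bound),
       * not on a prediction. */
      struct idle_state { unsigned int target_residency_us; };

      /* first_idx is the index of the first proper idle state (C1);
       * first_idx - 1 is the polling state. */
      static int pick_fallback_state(unsigned int next_timer_us,
                                     const struct idle_state *c1,
                                     int first_idx)
      {
              /* Use C1 only if we know we will stay idle at least as long
               * as its target residency; otherwise keep polling. */
              if (next_timer_us > c1->target_residency_us)
                      return first_idx;
              return first_idx - 1;
      }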
     

17 Mar, 2016

1 commit

  • The menu governor uses five different factors to pick the
    idle state:
    - the user configured latency_req
    - the time until the next timer (next_timer_us)
    - the typical sleep interval, as measured recently
    - an estimate of sleep time by dividing next_timer_us by an observed factor
    - a load corrected version of the above, divided again by load

    Only the first three items are known with enough confidence that
    we can use them to consider polling, instead of an actual CPU
    idle state, because the cost of being wrong about polling can be
    excessive power use.

    The latter two are used in the menu governor's main selection
    loop, and can result in choosing a shallower idle state when
    the system is expected to be busy again soon.

    This pushes a busy system in the "performance" direction of
    the performance<>power tradeoff when choosing between idle
    states, but stays more strictly on the "power" side when
    deciding between polling and C1 (a simplified sketch of that
    decision follows the entry).

    Signed-off-by: Rik van Riel
    Signed-off-by: Rafael J. Wysocki

    Rik van Riel
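
    A minimal sketch of the polling-vs-C1 decision using only the three
    high-confidence inputs named above; the names and the exact
    comparison are illustrative assumptions, not the kernel code:

      static int use_polling(unsigned int latency_req_us,
                             unsigned int next_timer_us,
                             unsigned int typical_interval_us,
                             unsigned int c1_exit_latency_us,
                             unsigned int c1_target_residency_us)
      {
              unsigned int expected_us = next_timer_us < typical_interval_us ?
                                         next_timer_us : typical_interval_us;

              /* The user's latency limit can rule C1 out entirely. */
              if (latency_req_us < c1_exit_latency_us)
                      return 1;

              /* Otherwise poll only when the high-confidence sleep-length
               * bounds say the sleep is shorter than C1 is worth. */
              return expected_us < c1_target_residency_us;
      }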
     

19 Jan, 2016

1 commit

  • If menu_select() cannot find a suitable state to return, it will
    return the state index stored in data->last_state_idx. This
    means that it is pointless to look at the states whose indices
    are less than or equal to data->last_state_idx in the main loop,
    so don't do that.

    Given that those checks are done on every idle state selection, this
    change can save quite a bit of completely unnecessary overhead (a
    simplified sketch of the resulting loop follows the entry).

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Ingo Molnar
    Tested-by: Sudeep Holla

    Rafael J. Wysocki
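
    A minimal sketch of the resulting selection loop; state_ok() is a
    hypothetical stand-in for the real residency and latency checks:

      static int select_state(int last_state_idx, int state_count,
                              int (*state_ok)(int idx))
      {
              int i, idx = last_state_idx;  /* returned if nothing deeper fits */

              /* States at or below last_state_idx cannot change the result,
               * so start scanning just above it. */
              for (i = last_state_idx + 1; i < state_count; i++)
                      if (state_ok(i))
                              idx = i;

              return idx;
      }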
     

15 Jan, 2016

1 commit

  • Commit a9ceb78bc75c (cpuidle,menu: use interactivity_req to disable
    polling) exposed a bug in menu_select() causing it to return -1
    on systems with CPUIDLE_DRIVER_STATE_START equal to zero, although
    it should have returned 0. As a result, idle states are not entered
    by CPUs on those systems.

    Namely, on the systems in question data->last_state_idx is initially
    equal to -1, and the above commit made the condition that would
    have caused it to be changed to 0 less likely to trigger, which
    exposed the problem. However, setting data->last_state_idx initially
    to -1 doesn't make sense at all and on the affected systems it should
    always be set to CPUIDLE_DRIVER_STATE_START (i.e. 0) unconditionally,
    so make that happen.

    Fixes: a9ceb78bc75c (cpuidle,menu: use interactivity_req to disable polling)
    Reported-and-tested-by: Sudeep Holla
    Signed-off-by: Rafael J. Wysocki

    Rafael J. Wysocki
     

17 Nov, 2015

3 commits

  • The cpuidle state tables contain the maximum exit latency for each
    cpuidle state. On x86, that is the exit latency for when the entire
    package goes into that same idle state.

    However, a lot of the time we only go into the core idle state,
    not the package idle state. This means we see a much smaller exit
    latency.

    We have no way to detect whether we went into the core or package
    idle state while idle, and that is ok.

    However, the current menu_update logic does have the potential to
    trip up the repeating pattern detection in get_typical_interval.
    If the system is experiencing an exit latency near the idle state's
    exit latency, some of the samples will have exit_us subtracted,
    while others will not. This turns a repeating pattern into mush,
    potentially breaking get_typical_interval.

    Furthermore, for smaller sleep intervals, we know the chance that
    all the cores in the package went to the same idle state is fairly
    small. Dividing measured_us by two, instead of subtracting the
    full exit latency when hitting a small measured_us, will reduce the
    error (a simplified sketch of this correction follows the entry).

    Signed-off-by: Rik van Riel
    Acked-by: Arjan van de Ven
    Signed-off-by: Rafael J. Wysocki

    Rik van Riel
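
    A minimal sketch of the correction described above (simplified types;
    close to, but not literally, the kernel code):

      static unsigned int correct_measured_us(unsigned int measured_us,
                                              unsigned int exit_latency_us)
      {
              /* A long sleep very likely paid the full package exit latency. */
              if (measured_us > 2 * exit_latency_us)
                      return measured_us - exit_latency_us;

              /* For short sleeps the package state was probably not reached;
               * halving the sample is a smaller error than subtracting the
               * full exit latency. */
              return measured_us / 2;
      }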
     
  • The menu governor carefully figures out how much time we typically
    sleep for an estimated sleep interval, or whether there is a repeating
    pattern going on, and corrects that estimate for the CPU load.

    Then it proceeds to ignore that information when determining whether
    or not to consider polling. This is not a big deal on most x86 CPUs,
    which have very low C1 latencies, and the patch should not have any
    effect on those CPUs.

    However, certain CPUs (eg. Atom) have much higher C1 latencies, and
    it would be good to not waste performance and power on those CPUs if
    we are expecting a very low wakeup latency.

    Disable polling based on the estimated interactivity requirement, not
    on the time to the next timer interrupt.

    Signed-off-by: Rik van Riel
    Acked-by: Arjan van de Ven
    Signed-off-by: Rafael J. Wysocki

    Rik van Riel
     
  • The cpuidle menu governor has a forced cut-off for polling at 5us,
    in order to deal with firmware that gives the OS bad information
    on cpuidle states, leading to the system spending way too much time
    in polling.

    However, at least one x86 CPU family (Atom) has chips that have
    a 20us break-even point for C1. Forcing the polling cut-off to
    less than that wastes performance and power.

    Increase the polling cut-off to 20us.

    Systems with a lower C1 latency will be found in the states table by
    the menu governor, which will pick those states as appropriate.

    Signed-off-by: Rik van Riel
    Acked-by: Arjan van de Ven
    Signed-off-by: Rafael J. Wysocki

    Rik van Riel
     

05 May, 2015

1 commit

  • Avoid calling the governor's ->reflect method if the state index
    passed to cpuidle_reflect() is negative.

    This allows the analogous check to be dropped from menu_reflect(),
    so do that too, and ensures that arbitrary error codes can be
    passed to cpuidle_reflect() as the index with no adverse
    consequences (a simplified sketch of the guard follows the entry).

    Signed-off-by: Rafael J. Wysocki
    Reviewed-by: Daniel Lezcano
    Acked-by: Peter Zijlstra (Intel)

    Rafael J. Wysocki
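
    A minimal sketch of the guard, with simplified types standing in for
    the real cpuidle structures:

      struct governor { void (*reflect)(void *dev, int index); };

      static void reflect_if_valid(const struct governor *gov, void *dev,
                                   int index)
      {
              /* A negative index is an error code, not a state index,
               * so do not call ->reflect() for it. */
              if (gov->reflect && index >= 0)
                      gov->reflect(dev, index);
      }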
     

17 Apr, 2015

1 commit

  • Now that the kernel provides DIV_ROUND_CLOSEST_ULL(), drop the internal
    implementation and use the kernel one.

    Signed-off-by: Javi Merino
    Acked-by: Rafael J. Wysocki
    Cc: Mel Gorman
    Cc: Stephen Hemminger
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Javi Merino
     

17 Dec, 2014

1 commit

  • When menu sees CPUIDLE_FLAG_TIME_INVALID, it ignores its timestamps,
    and assumes that idle lasted as long as the time till next predicted
    timer expiration.

    But if an interrupt was seen and serviced before that duration,
    it would actually be more accurate to use the measured time
    rather than rounding up to the next predicted timer expiration.

    And if an interrupt is seen and serviced such that the measured time
    exceeds the time till next predicted timer expiration, then
    truncating to that expiration is the right thing to do --
    since we can never stay idle past that timer expiration.

    So the code can do a better job without checking for
    CPUIDLE_FLAG_TIME_INVALID (a simplified sketch of the resulting
    logic follows the entry).

    Signed-off-by: Len Brown
    Acked-by: Daniel Lezcano
    Reviewed-by: Tuukka Tikkanen
    Signed-off-by: Rafael J. Wysocki

    Len Brown
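
    A minimal sketch of the resulting logic (illustrative names, not the
    kernel code):

      /* Trust the measured idle time, but never report more than the time
       * that remained until the next predicted timer expiration. */
      static unsigned int idle_duration_us(unsigned int measured_us,
                                           unsigned int next_timer_us)
      {
              return measured_us < next_timer_us ? measured_us : next_timer_us;
      }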
     

13 Nov, 2014

1 commit

  • The only place where the time is invalid is when the ACPI_CSTATE_FFH entry
    method is not set. Otherwise for all the drivers, the time can be correctly
    measured.

    Instead of duplicating the CPUIDLE_FLAG_TIME_VALID flag in all the drivers
    for all the states, invert the logic and replace it with the flag
    CPUIDLE_FLAG_TIME_INVALID. That flag only needs to be set for the ACPI idle
    driver; the former flag can be removed from all the other drivers, and the
    governors can check the inverted flag instead.

    Signed-off-by: Daniel Lezcano
    Signed-off-by: Rafael J. Wysocki

    Daniel Lezcano
     

15 Aug, 2014

1 commit

  • Pull more ACPI and power management updates from Rafael Wysocki:
    "These are a couple of regression fixes, cpuidle menu governor
    optimizations, fixes for ACPI processor and battery drivers,
    hibernation fix to avoid problems related to the e820 memory map,
    fixes for a few cpufreq drivers and a new version of the suspend
    profiling tool analyze_suspend.py.

    Specifics:

    - Fix for an ACPI-based device hotplug regression introduced in 3.14
    that causes a kernel panic to trigger when memory hot-remove is
    attempted with CONFIG_ACPI_HOTPLUG_MEMORY unset from Tang Chen

    - Fix for a cpufreq regression introduced in 3.16 that triggers a
    "sleeping function called from invalid context" bug in
    dev_pm_opp_init_cpufreq_table() from Stephen Boyd

    - ACPI battery driver fix for a warning message added in 3.16 that
    prints silly stuff sometimes from Mariusz Ceier

    - Hibernation fix for safer handling of mismatches in the e820 memory
    map between the configurations during image creation and during the
    subsequent restore from Chun-Yi Lee

    - ACPI processor driver fix to handle CPU hotplug notifications
    correctly during system suspend/resume from Lan Tianyu

    - Series of four cpuidle menu governor cleanups that also should
    speed it up a bit from Mel Gorman

    - Fixes for the speedstep-smi, integrator, cpu0 and arm_big_little
    cpufreq drivers from Hans Wennborg, Himangi Saraogi, Markus
    Pargmann and Uwe Kleine-König

    - Version 3.0 of the analyze_suspend.py suspend profiling tool from
    Todd E Brandt"

    * tag 'pm+acpi-3.17-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
    ACPI / battery: Fix warning message in acpi_battery_get_state()
    PM / tools: analyze_suspend.py: update to v3.0
    cpufreq: arm_big_little: fix module license spec
    cpufreq: speedstep-smi: fix decimal printf specifiers
    ACPI / hotplug: Check scan handlers in acpi_scan_hot_remove()
    cpufreq: OPP: Avoid sleeping while atomic
    cpufreq: cpu0: Do not print error message when deferring
    cpufreq: integrator: Use set_cpus_allowed_ptr
    PM / hibernate: avoid unsafe pages in e820 reserved regions
    ACPI / processor: Make acpi_cpu_soft_notify() process CPU FROZEN events
    cpuidle: menu: Lookup CPU runqueues less
    cpuidle: menu: Call nr_iowait_cpu less times
    cpuidle: menu: Use ktime_to_us instead of reinventing the wheel
    cpuidle: menu: Use shifts when calculating averages where possible

    Linus Torvalds
     

07 Aug, 2014

5 commits

  • Pull trivial tree changes from Jiri Kosina:
    "Summer edition of trivial tree updates"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (23 commits)
    doc: fix two typos in watchdog-api.txt
    irq-gic: remove file name from heading comment
    MAINTAINERS: Add miscdevice.h to file list for char/misc drivers.
    scsi: mvsas: mv_sas.c: Fix for possible null pointer dereference
    doc: replace "practise" with "practice" in Documentation
    befs: remove check for CONFIG_BEFS_RW
    scsi: doc: fix 'SCSI_NCR_SETUP_MASTER_PARITY'
    drivers/usb/phy/phy.c: remove a leading space
    mfd: fix comment
    cpuidle: fix comment
    doc: hpfall.c: fix missing null-terminate after strncpy call
    usb: doc: hotplug.txt code typos
    kbuild: fix comment in Makefile.modinst
    SH: add proper prompt to SH_MAGIC_PANEL_R2_VERSION
    ARM: msm: Remove MSM_SCM
    crypto: Remove MPILIB_EXTRA
    doc: CN: remove dead link, kerneltrap.org no longer works
    media: update reference, kerneltrap.org no longer works
    hexagon: update reference, kerneltrap.org no longer works
    doc: LSM: update reference, kerneltrap.org no longer works
    ...

    Linus Torvalds
     
  • The menu governor makes separate lookups of the CPU runqueue to get
    the load and the number of IO waiters, but both can be obtained with
    a single lookup.

    Signed-off-by: Mel Gorman
    Signed-off-by: Rafael J. Wysocki

    Mel Gorman
     
  • menu_select(), via inline functions, calls nr_iowait_cpu() twice as
    often as necessary.

    Signed-off-by: Mel Gorman
    Signed-off-by: Rafael J. Wysocki

    Mel Gorman
     
  • The kernel's ktime_to_us() implementation is slightly better than the
    one open-coded in menu.c, so use it.

    Signed-off-by: Mel Gorman
    Acked-by: Daniel Lezcano
    Signed-off-by: Rafael J. Wysocki

    Mel Gorman
     
  • We use do_div even though the divisor will usually be a power of two
    unless there are unusual outliers, so use shifts where possible (a
    simplified sketch follows the entry).

    Signed-off-by: Mel Gorman
    Signed-off-by: Rafael J. Wysocki

    Mel Gorman
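
    A minimal sketch of the idea (illustrative constants and names, not
    the kernel code): when no outliers were dropped, the divisor is the
    power-of-two number of recorded intervals and a shift suffices; only
    the trimmed case needs a real division.

      #define INTERVALS      8u
      #define INTERVAL_SHIFT 3u       /* INTERVALS == 1 << INTERVAL_SHIFT */

      static unsigned long long average(unsigned long long sum,
                                        unsigned int divisor)
      {
              if (divisor == INTERVALS)
                      return sum >> INTERVAL_SHIFT;   /* common case */
              return sum / divisor;                   /* outliers dropped */
      }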
     

06 Mar, 2014

5 commits

  • The menu governor performance multiplier defines a minimum predicted
    idle duration to latency ratio. Instead of checking this separately
    in every iteration of the state selection loop, adjust the overall
    latency restriction for the whole loop if this restriction is tighter
    than what is set by the QoS subsystem.

    The original test
    s->exit_latency * multiplier > data->predicted_us
    becomes
    s->exit_latency > data->predicted_us / multiplier
    by dividing both sides of the comparison by "multiplier".

    While division is likely to be several times slower than multiplication,
    the minor performance hit allows making a generic sleep state selection
    function based on a (sleep duration, maximum latency) tuple (a simplified
    sketch of the tightened latency limit follows the entry).

    Signed-off-by: Tuukka Tikkanen
    Signed-off-by: Rafael J. Wysocki

    tuukka.tikkanen@linaro.org
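
    A minimal sketch of the tightened latency limit described above
    (illustrative names); the selection loop then only compares each
    state's exit_latency against the returned value:

      static unsigned int effective_latency_req(unsigned int qos_latency_req,
                                                unsigned int predicted_us,
                                                unsigned int multiplier)
      {
              /* "s->exit_latency * multiplier > predicted_us" becomes
               * "s->exit_latency > predicted_us / multiplier", so the
               * division is done once, before the loop. */
              unsigned int perf_limit = predicted_us / multiplier;

              return qos_latency_req < perf_limit ? qos_latency_req : perf_limit;
      }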
     
  • The menu governor statistics update function tries to determine the
    amount of time between entry to low power state and the occurrence
    of the wakeup event. However, the time measured by the framework
    includes exit latency on top of the desired value. This exit latency
    is subtracted from the measured value to obtain the desired value.

    When measured value is not available, the menu governor assumes
    the wakeup was caused by the timer and the time is equal to remaining
    timer length. No exit latency should be subtracted from this value.

    This patch prevents the erroneous subtraction and clarifies the
    associated comment. It also removes one intermediate variable that
    serves no purpose.

    Signed-off-by: Tuukka Tikkanen
    Signed-off-by: Rafael J. Wysocki

    tuukka.tikkanen@linaro.org
     
  • The menu governor uses coefficients as one method of actual idle
    period length estimation. The coefficients are, as detailed below,
    multipliers giving expected idle period length from time until next
    timer expiry. The multipliers are supposed to have domain of (0..1].

    The coefficients are fractions where only the numerators are stored
    and denominators are a shared constant RESOLUTION*DECAY. Since the
    value of the coefficient should always be greater than 0 and less
    than or equal to 1, the numerator must have a value greater than
    0 and less than or equal to RESOLUTION*DECAY.

    If the coefficients are updated with measured idle durations exceeding
    the timer length, the multiplier may reach values exceeding unity (i.e.
    the stored numerator exceeds RESOLUTION*DECAY). This patch ensures that
    the multipliers are updated with durations capped to the timer length
    (a simplified sketch of the capped update follows the entry).

    Signed-off-by: Tuukka Tikkanen
    Acked-by: Nicolas Pitre
    Signed-off-by: Rafael J. Wysocki

    tuukka.tikkanen@linaro.org
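
    A minimal sketch of the capped update (illustrative, not the kernel
    implementation; the constants use the default values mentioned
    elsewhere in this log):

      #define RESOLUTION 1024u
      #define DECAY         8u

      static unsigned int update_correction_factor(unsigned int old_factor,
                                                   unsigned int measured_us,
                                                   unsigned int next_timer_us)
      {
              unsigned int new_factor;

              /* The cap added by this patch: a sample longer than the timer
               * would push the stored numerator above RESOLUTION*DECAY. */
              if (measured_us > next_timer_us)
                      measured_us = next_timer_us;

              /* Decay the old value, then add the new sample's contribution. */
              new_factor = old_factor - old_factor / DECAY;
              if (next_timer_us)
                      new_factor += RESOLUTION * measured_us / next_timer_us;

              return new_factor;
      }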
     
  • Currently menu governor records the exit latency of the state it has
    chosen for the idle period. The stored latency value is then later
    used to calculate the actual length of the idle period. This value
    may however be incorrect, as the entered state may not be the one
    chosen by the governor. The entered state information is available,
    so we can use that to obtain the real exit latency.

    Signed-off-by: Tuukka Tikkanen
    Acked-by: Daniel Lezcano
    Signed-off-by: Rafael J. Wysocki

    tuukka.tikkanen@linaro.org
     
  • The field expected_us is used to store the time remaining until next
    timer expiry. The name is inaccurate, as we really do not expect all
    wakeups to be caused by timers. In addition, another field with a very
    similar name (predicted_us) is used to store the predicted time
    remaining until any wakeup source being active.

    This patch renames expected_us to next_timer_us in order to better
    reflect the contained information.

    Signed-off-by: Tuukka Tikkanen
    Acked-by: Nicolas Pitre
    Acked-by: Len Brown
    Signed-off-by: Rafael J. Wysocki

    tuukka.tikkanen@linaro.org
     

23 Aug, 2013

8 commits

  • The value of field predicted_us can never exceed that of expected_us, but
    it has a potentially larger type. As there is no need for additional 32
    bits of zeroes on 32 bit platforms, change the type of predicted_us to
    match the type of expected_us.

    Field correction_factor is used to store a value that cannot exceed the
    product of RESOLUTION and DECAY (default 1024*8 = 8192). In practice the
    constants cannot be increased to values large enough to overflow unsigned
    int even on 32 bit systems, so the type is changed to avoid unnecessary
    64 bit arithmetic on 32 bit systems.

    One multiplication of (now) 32 bit values needs a cast to avoid truncation
    of the result, so one has been added.

    In order to avoid another multiplication from 32 bit domain to 64 bit
    domain, the new correction_factor calculation has been changed from
    new = old * (DECAY-1) / DECAY
    to
    new = old - old / DECAY,
    which with infinite precision would yield exactly the same result, but
    now changes the direction of rounding. The impact is not significant as
    the maximum accumulated difference cannot exceed the value of DECAY,
    which is relatively small compared to product of RESOLUTION and DECAY
    (8 / 8192).

    Signed-off-by: Tuukka Tikkanen
    Signed-off-by: Rafael J. Wysocki

    Tuukka Tikkanen
     
  • The menu governor has a number of tunable constants that may be changed
    in the source. If certain combination of values are chosen, an overflow
    is possible when the correction_factor is being recalculated.

    This patch adds a warning regarding this possibility and describes the
    change needed for fixing the issue. The change should not be permanently
    enabled, as it will hurt performance when it is not needed.

    Signed-off-by: Tuukka Tikkanen
    Signed-off-by: Rafael J. Wysocki

    Tuukka Tikkanen
     
  • The menu governor uses a static function get_typical_interval() to
    try to detect a repeating pattern of wakeups. The previous interval
    durations are stored as an array of unsigned ints, but the arithmetic
    in the function is performed exclusively as 64 bit values, even when
    the value stored in a variable is known not to exceed unsigned int,
    which may be smaller and more efficient on some platforms.

    This patch changes the types of the variables used to store some
    intermediates, the maximum, and the cutoff threshold to unsigned
    ints. Average and standard deviation are still treated as 64 bit values,
    even when the values are known to be within the domain of unsigned int,
    to avoid casts to ensure correct integer promotion for arithmetic
    operations.

    Signed-off-by: Tuukka Tikkanen
    Signed-off-by: Rafael J. Wysocki

    Tuukka Tikkanen
     
  • Struct menu_device member intervals is declared as u32, but the value
    stored is (unsigned) int. The type is changed to match the value being
    stored.

    Signed-off-by: Tuukka Tikkanen
    Signed-off-by: Rafael J. Wysocki

    Tuukka Tikkanen
     
  • The function get_typical_interval() initializes a number of variables
    that are assigned constant values immediately after their declarations.
    In addition, there are multiple assignments on a single line, which
    is explicitly forbidden by Documentation/CodingStyle.

    This patch removes redundant initial values for the variables and
    breaks up the multiple assignment line.

    Signed-off-by: Tuukka Tikkanen
    Signed-off-by: Rafael J. Wysocki

    Tuukka Tikkanen
     
  • get_typical_interval() uses int_sqrt() in calculation of standard
    deviation. The formal parameter of int_sqrt() is unsigned long, which
    may on some platforms be smaller than the 64 bit unsigned integer used
    as the actual parameter. The overflow can occur frequently when actual
    idle period lengths are in hundreds of milliseconds.

    This patch adds a check for such overflow and rejects the candidate
    average when an overflow would occur (a simplified sketch of the
    guard follows the entry).

    Signed-off-by: Tuukka Tikkanen
    Signed-off-by: Rafael J. Wysocki

    Tuukka Tikkanen
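
    A userspace sketch of the guard, where sqrt() stands in for the
    kernel's int_sqrt(), whose argument is an unsigned long:

      #include <limits.h>
      #include <math.h>

      static int usable_stddev(unsigned long long variance,
                               unsigned long *stddev)
      {
              /* Reject the candidate average if the 64-bit variance would
               * not fit in int_sqrt()'s unsigned long argument. */
              if (variance > ULONG_MAX)
                      return 0;

              *stddev = (unsigned long)sqrt((double)variance);
              return 1;
      }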
     
  • This patch rearranges a if-return-elsif-goto-fi-return sequence into
    if-return-fi-if-return-fi-goto sequence. The functionality remains the
    same. Also, a lengthy comment that did not describe the functionality
    in the order it occurs is split in half, and the top half is moved
    closer to the actual implementation it describes.

    Signed-off-by: Tuukka Tikkanen
    Signed-off-by: Rafael J. Wysocki

    Tuukka Tikkanen
     
  • This patch prevents cpuidle menu governor from using repeating interval
    prediction result if the idle period predicted is longer than the one
    allowed by shortest running timer.

    Signed-off-by: Tuukka Tikkanen
    Signed-off-by: Rafael J. Wysocki

    Tuukka Tikkanen
     

29 Jul, 2013

2 commits

  • Revert commit 69a37bea (cpuidle: Quickly notice prediction failure for
    repeat mode), because it has been identified as the source of a
    significant performance regression in v3.8 and later as explained by
    Jeremy Eder:

    We believe we've identified a particular commit to the cpuidle code
    that seems to be impacting the performance of a variety of workloads.
    The simplest way to reproduce it is using the netperf TCP_RR test, so
    we're using that, on a pair of Sandy Bridge based servers. We also
    have data from a large database setup where performance is also
    measurably/positively impacted, though that test data isn't easily
    share-able.

    Included below are test results from 3 test kernels:

    kernel reverts
    -----------------------------------------------------------
    1) vanilla upstream (no reverts)

    2) perfteam2 reverts e11538d1f03914eb92af5a1a378375c05ae8520c

    3) test reverts 69a37beabf1f0a6705c08e879bdd5d82ff6486c4
    e11538d1f03914eb92af5a1a378375c05ae8520c

    In summary, netperf TCP_RR numbers improve by approximately 4%
    after reverting 69a37beabf1f0a6705c08e879bdd5d82ff6486c4. When
    69a37beabf1f0a6705c08e879bdd5d82ff6486c4 is included, C0 residency
    never seems to get above 40%. Taking that patch out gets C0 near
    100% quite often, and performance increases.

    The below data are histograms representing the %c0 residency @
    1-second sample rates (using turbostat), while under netperf test.

    - If you look at the first 4 histograms, you can see %c0 residency
    almost entirely in the 30,40% bin.
    - The last pair, which reverts 69a37beabf1f0a6705c08e879bdd5d82ff6486c4,
    shows %c0 in the 80,90,100% bins.

    Below each kernel name are netperf TCP_RR trans/s numbers for the
    particular kernel that can be disclosed publicly, comparing the 3
    test kernels. We ran a 4th test with the vanilla kernel where
    we've also set /dev/cpu_dma_latency=0 to show overall impact
    boosting single-threaded TCP_RR performance over 11% above
    baseline.

    3.10-rc2 vanilla RX + c0 lock (/dev/cpu_dma_latency=0):
    TCP_RR trans/s 54323.78

    -----------------------------------------------------------
    3.10-rc2 vanilla RX (no reverts)
    TCP_RR trans/s 48192.47

    Receiver %c0
    0.0000 - 10.0000 [ 1]: *
    10.0000 - 20.0000 [ 0]:
    20.0000 - 30.0000 [ 0]:
    30.0000 - 40.0000 [ 59]: ***********************************************************
    40.0000 - 50.0000 [ 1]: *
    50.0000 - 60.0000 [ 0]:
    60.0000 - 70.0000 [ 0]:
    70.0000 - 80.0000 [ 0]:
    80.0000 - 90.0000 [ 0]:
    90.0000 - 100.0000 [ 0]:

    Sender %c0
    0.0000 - 10.0000 [ 1]: *
    10.0000 - 20.0000 [ 0]:
    20.0000 - 30.0000 [ 0]:
    30.0000 - 40.0000 [ 11]: ***********
    40.0000 - 50.0000 [ 49]: *************************************************
    50.0000 - 60.0000 [ 0]:
    60.0000 - 70.0000 [ 0]:
    70.0000 - 80.0000 [ 0]:
    80.0000 - 90.0000 [ 0]:
    90.0000 - 100.0000 [ 0]:

    -----------------------------------------------------------
    3.10-rc2 perfteam2 RX (reverts commit
    e11538d1f03914eb92af5a1a378375c05ae8520c)
    TCP_RR trans/s 49698.69

    Receiver %c0
    0.0000 - 10.0000 [ 1]: *
    10.0000 - 20.0000 [ 1]: *
    20.0000 - 30.0000 [ 0]:
    30.0000 - 40.0000 [ 59]: ***********************************************************
    40.0000 - 50.0000 [ 0]:
    50.0000 - 60.0000 [ 0]:
    60.0000 - 70.0000 [ 0]:
    70.0000 - 80.0000 [ 0]:
    80.0000 - 90.0000 [ 0]:
    90.0000 - 100.0000 [ 0]:

    Sender %c0
    0.0000 - 10.0000 [ 1]: *
    10.0000 - 20.0000 [ 0]:
    20.0000 - 30.0000 [ 0]:
    30.0000 - 40.0000 [ 2]: **
    40.0000 - 50.0000 [ 58]: **********************************************************
    50.0000 - 60.0000 [ 0]:
    60.0000 - 70.0000 [ 0]:
    70.0000 - 80.0000 [ 0]:
    80.0000 - 90.0000 [ 0]:
    90.0000 - 100.0000 [ 0]:

    -----------------------------------------------------------
    3.10-rc2 test RX (reverts 69a37beabf1f0a6705c08e879bdd5d82ff6486c4
    and e11538d1f03914eb92af5a1a378375c05ae8520c)
    TCP_RR trans/s 47766.95

    Receiver %c0
    0.0000 - 10.0000 [ 1]: *
    10.0000 - 20.0000 [ 1]: *
    20.0000 - 30.0000 [ 0]:
    30.0000 - 40.0000 [ 27]: ***************************
    40.0000 - 50.0000 [ 2]: **
    50.0000 - 60.0000 [ 0]:
    60.0000 - 70.0000 [ 2]: **
    70.0000 - 80.0000 [ 0]:
    80.0000 - 90.0000 [ 0]:
    90.0000 - 100.0000 [ 28]: ****************************

    Sender:
    0.0000 - 10.0000 [ 1]: *
    10.0000 - 20.0000 [ 0]:
    20.0000 - 30.0000 [ 0]:
    30.0000 - 40.0000 [ 11]: ***********
    40.0000 - 50.0000 [ 0]:
    50.0000 - 60.0000 [ 1]: *
    60.0000 - 70.0000 [ 0]:
    70.0000 - 80.0000 [ 3]: ***
    80.0000 - 90.0000 [ 7]: *******
    90.0000 - 100.0000 [ 38]: **************************************

    These results demonstrate gaining back the tendency of the CPU to
    stay in more responsive, performant C-states (and thus yield
    measurably better performance), by reverting commit
    69a37beabf1f0a6705c08e879bdd5d82ff6486c4.

    Requested-by: Jeremy Eder
    Tested-by: Len Brown
    Cc: 3.8+
    Signed-off-by: Rafael J. Wysocki

    Rafael J. Wysocki
     
  • Revert commit e11538d1 (cpuidle: Quickly notice prediction failure in
    general case), since it depends on commit 69a37be (cpuidle: Quickly
    notice prediction failure for repeat mode) that has been identified
    as the source of a significant performance regression in v3.8 and
    later.

    Requested-by: Jeremy Eder
    Tested-by: Len Brown
    Cc: 3.8+
    Signed-off-by: Rafael J. Wysocki

    Rafael J. Wysocki