20 Jul, 2012

2 commits

  • Under some conditions, c1% was displayed as a very large number,
    much higher than 100%.

    c1% is not measured, it is derived as "that, which is left over"
    from other counters. However, the other counters are not collected
    atomically, and so it is possible for c1% to be calculated as
    a small negative number -- displayed as a very large positive one.

    There was a check for mperf vs tsc for this already,
    but it needed to also include the other counters
    that are used to calculate c1.

    Signed-off-by: Len Brown

    Len Brown
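
The wraparound described in this commit can be reproduced with 64-bit unsigned arithmetic. This is a minimal sketch, not turbostat's code: the function names, the sample counter values, and the choice to clamp c1 to zero are all assumptions made for illustration.

```python
MASK64 = (1 << 64) - 1  # emulate C unsigned 64-bit wraparound


def c1_unchecked(tsc, mperf, c3, c6, c7):
    """Derive c1 as the leftover, wrapping like a C unsigned long long."""
    return (tsc - mperf - c3 - c6 - c7) & MASK64


def c1_checked(tsc, mperf, c3, c6, c7):
    """Same derivation, but treat an overshoot by the other counters as c1 = 0.

    Clamping to zero is an assumption of this sketch.
    """
    others = mperf + c3 + c6 + c7
    return 0 if others > tsc else tsc - others


def percent(part, whole):
    """Express part as a percentage of whole."""
    return 100.0 * part / whole


# Hypothetical deltas: the counters were not sampled atomically, so the
# other counters add up to slightly more than the TSC delta.
tsc, mperf, c3, c6, c7 = 1000, 10, 0, 0, 995
wrapped = percent(c1_unchecked(tsc, mperf, c3, c6, c7), tsc)
clamped = percent(c1_checked(tsc, mperf, c3, c6, c7), tsc)
```

With these inputs the unchecked result is far above 100%, while the checked version reports 0%.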
     
  • Measuring large profoundly-idle configurations
    requires turbostat to be more lightweight.
    Otherwise, the operation of turbostat itself
    can interfere with the measurements.

    This re-write makes turbostat topology aware.
    Hardware is accessed in "topology order".
    Redundant hardware accesses are deleted.
    Redundant output is deleted.
    Also, output is buffered and
    local RDTSC use replaces remote MSR access for TSC.

    From a feature point of view, the output
    looks different since redundant figures are absent.
    Also, there are now -c and -p options -- to restrict
    output to the 1st thread in each core, and the 1st
    thread in each package, respectively. This is helpful
    to reduce output on big systems, where more detail
    than the "-s" system summary is desired.
    Finally, periodic mode output is now on stdout, not stderr.

    Turbostat v2 is also slightly more robust in
    handling run-time CPU online/offline events,
    as it now checks the actual map of on-line cpus rather
    than just the total number of on-line cpus.

    Signed-off-by: Len Brown

    Len Brown
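
The difference between checking the number of on-line cpus and checking the actual map can be shown in a few lines. This is a sketch under stated assumptions: it parses the kernel's CPU-list text format (for example "0-3,5"), and the helper names are hypothetical rather than taken from turbostat.

```python
def parse_cpu_list(text):
    """Parse a kernel-style CPU list such as "0-3,5" into a set of ints."""
    cpus = set()
    for chunk in text.strip().split(","):
        if not chunk:
            continue
        if "-" in chunk:
            lo, hi = chunk.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(chunk))
    return cpus


def changed_by_count(before, after):
    """Detect a change by comparing only how many cpus are on-line."""
    return len(before) != len(after)


def changed_by_map(before, after):
    """Detect a change by comparing which cpus are on-line."""
    return before != after


# cpu3 goes offline and cpu4 comes online between two samples.
before = parse_cpu_list("0-3")
after = parse_cpu_list("0-2,4")
```

Here the count is 4 both times, so only the map comparison notices the change.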
     

04 Jun, 2012

2 commits

  • Initial IVB support went into turbostat in Linux-3.1:
    553575f1ae048aa44682b46b3c51929a0b3ad337
    (tools turbostat: recognize and run properly on IVB)

    However, when running on IVB, turbostat would fail
    to report the new counters added with SNB: c7, pc2 and pc7.
    So in scenarios where these counters are non-zero on IVB,
    turbostat would report erroneous residency results.

    In particular c7 time would be added to c1 time,
    since c1 time is calculated as "that which is left over".

    Also, turbostat reports MHz capabilities when passed
    the "-v" option, and it would incorrectly report 133MHz
    bclk instead of 100MHz bclk for IVB, which would inflate
    GHz reported with that option.

    This patch is a backport of a fix already included in turbostat v2.

    Signed-off-by: Len Brown

    Len Brown
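
Both effects in this commit are simple arithmetic. The sketch below uses hypothetical counter deltas and a hypothetical ratio of 35; the function names are illustrative only and are not turbostat's.

```python
def c1_leftover(tsc, mperf, c3, c6, c7=0):
    """c1 is whatever the measured counters do not account for."""
    return tsc - (mperf + c3 + c6 + c7)


def ratio_to_mhz(ratio, bclk_mhz):
    """Convert a bus-clock multiplier into a frequency in MHz."""
    return ratio * bclk_mhz


# Hypothetical counter deltas for one mostly idle interval.
tsc, mperf, c3, c6, c7 = 1000, 10, 0, 0, 900

c1_without_c7 = c1_leftover(tsc, mperf, c3, c6)    # c7 never read
c1_with_c7 = c1_leftover(tsc, mperf, c3, c6, c7)   # c7 accounted for

mhz_wrong = ratio_to_mhz(35, 133.33)  # hypothetical ratio, 133 MHz bclk
mhz_right = ratio_to_mhz(35, 100.0)   # same ratio, 100 MHz bclk
```

When c7 is not read, its time lands in the leftover and c1 is overstated; using a 133 MHz bclk in place of 100 MHz inflates every derived frequency by about a third.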
     
  • Linux 3.4 included a modification to turbostat to
    lower cross-call overhead by using scheduler affinity:

    15aaa34654831e98dd76f7738b6c7f5d05a66430
    (tools turbostat: reduce measurement overhead due to IPIs)

    In the use-case where turbostat forks a child program,
    that change had the un-intended side-effect of binding
    the child to the last cpu in the system.

    This change removed the binding before forking the child.

    This is a back-port of a fix already included in turbostat v2.

    Signed-off-by: Len Brown

    Len Brown
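
A child process inherits its parent's CPU affinity mask, which is why the binding has to be undone before the fork. The sketch below illustrates the idea with Python's Linux-only `os.sched_getaffinity` and `os.sched_setaffinity`; the function name and the choice of which CPU to pin to are assumptions, and this is not turbostat's implementation.

```python
import os
import subprocess
import sys


def run_child_unbound(argv):
    """Pin to one CPU for measurement, then restore the original
    affinity mask before starting the child."""
    original = os.sched_getaffinity(0)           # mask we started with
    os.sched_setaffinity(0, {max(original)})     # stand-in for measurement pinning
    try:
        pass  # per-CPU measurement work would happen here
    finally:
        os.sched_setaffinity(0, original)        # undo the binding
    # The child now inherits the original mask, not the single-CPU one.
    return subprocess.run(argv, capture_output=True, text=True, check=True)


# A child that simply reports the affinity mask it inherited.
report_cmd = [sys.executable, "-c",
              "import os; print(sorted(os.sched_getaffinity(0)))"]
```

Calling `run_child_unbound(report_cmd)` should print the same CPU set the parent started with; without the restore step the child would report only the one CPU the parent was pinned to.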
     

17 May, 2012

1 commit

  • It's been broken forever (i.e. it's not scheduling in a power
    aware fashion), as reported by Suresh and others sending
    patches, and nobody cares enough to fix it properly ...
    so remove it to make space free for something better.

    There are various problems with the code as it stands today, first
    and foremost the user interface which is bound to topology
    levels and has multiple values per level. This results in a
    state explosion which the administrator or distro needs to
    master and almost nobody does.

    Furthermore, large configuration state spaces aren't good: it
    means the thing doesn't just work right, because it's either
    under so many impossible-to-meet constraints, or, even if
    there's an achievable state, workloads have to be aware of
    it precisely and can never meet it for dynamic workloads.

    So pushing this kind of decision to user-space was a bad idea
    even with a single knob - it's exponentially worse with knobs
    on every node of the topology.

    There is a proposal to replace the user interface with a single
    3 state knob:

    sched_balance_policy := { performance, power, auto }

    where 'auto' would be the preferred default which looks at things
    like Battery/AC mode and possible cpufreq state or whatever the hw
    exposes to show us power use expectations - but there's been no
    progress on it in the past many months.

    Aside from that, the actual implementation of the various knobs
    is known to be broken. There have been sporadic attempts at
    fixing things but these always stop short of reaching a mergeable
    state.

    Therefore this wholesale removal with the hopes of spurring
    people who care to come forward once again and work on a
    coherent replacement.

    Signed-off-by: Peter Zijlstra
    Cc: Suresh Siddha
    Cc: Arjan van de Ven
    Cc: Vincent Guittot
    Cc: Vaidyanathan Srinivasan
    Cc: Linus Torvalds
    Cc: Andrew Morton
    Link: http://lkml.kernel.org/r/1326104915.2442.53.camel@twins
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

31 Mar, 2012

1 commit

  • Pull ACPI & Power Management changes from Len Brown:
    - ACPI 5.0 after-ripples, ACPICA/Linux divergence cleanup
    - cpuidle evolving, more ARM use
    - thermal sub-system evolving, ditto
    - assorted other PM bits

    Fix up conflicts in various cpuidle implementations due to ARM cpuidle
    cleanups (ARM at91 self-refresh and cpu idle code rewritten into
    "standby" in asm conflicting with the consolidation of cpuidle time
    keeping), trivial SH include file context conflict and RCU tracing fixes
    in generic code.

    * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux: (77 commits)
    ACPI throttling: fix endian bug in acpi_read_throttling_status()
    Disable MCP limit exceeded messages from Intel IPS driver
    ACPI video: Don't start video device until its associated input device has been allocated
    ACPI video: Harden video bus adding.
    ACPI: Add support for exposing BGRT data
    ACPI: export acpi_kobj
    ACPI: Fix logic for removing mappings in 'acpi_unmap'
    CPER failed to handle generic error records with multiple sections
    ACPI: Clean redundant codes in scan.c
    ACPI: Fix unprotected smp_processor_id() in acpi_processor_cst_has_changed()
    ACPI: consistently use should_use_kmap()
    PNPACPI: Fix device ref leaking in acpi_pnp_match
    ACPI: Fix use-after-free in acpi_map_lsapic
    ACPI: processor_driver: add missing kfree
    ACPI, APEI: Fix incorrect APEI register bit width check and usage
    Update documentation for parameter *notrigger* in einj.txt
    ACPI, APEI, EINJ, new parameter to control trigger action
    ACPI, APEI, EINJ, limit the range of einj_param
    ACPI, APEI, Fix ERST header length check
    cpuidle: power_usage should be declared signed integer
    ...

    Linus Torvalds
     

30 Mar, 2012

3 commits

  • Sometimes users have turbostat running in interval mode
    when they take processors offline/online.

    Previously, turbostat would survive, but not gracefully.

    Tighten up the error checking so turbostat notices
    changes sooner, and print just 1 line on change:

    turbostat: re-initialized with num_cpus %d

    Signed-off-by: Len Brown

    Len Brown
     
  • turbostat uses /dev/cpu/*/msr interface to read MSRs.
    For modern systems, it reads 10 MSR/CPU. This can
    be observed as 10 "Function Call Interrupts"
    per CPU per sample added to /proc/interrupts.

    This overhead is measurable on large idle systems,
    and as Youquan Song pointed out, it can even trick
    cpuidle into thinking the system is busy.

    Here turbostat re-schedules itself in-turn to each
    CPU so that its MSR reads will always be local.
    This replaces the 10 "Function Call Interrupts"
    with a single "Rescheduling interrupt" per sample
    per CPU.

    On an idle 32-CPU system, this shifts some residency from
    the shallow c1 state to the deeper c7 state:

    # ./turbostat.old -s
    %c0 GHz TSC %c1 %c3 %c6 %c7 %pc2 %pc3 %pc6 %pc7
    0.27 1.29 2.29 0.95 0.02 0.00 98.77 20.23 0.00 77.41 0.00
    0.25 1.24 2.29 0.98 0.02 0.00 98.75 20.34 0.03 77.74 0.00
    0.27 1.22 2.29 0.54 0.00 0.00 99.18 20.64 0.00 77.70 0.00
    0.26 1.22 2.29 1.22 0.00 0.00 98.52 20.22 0.00 77.74 0.00
    0.26 1.38 2.29 0.78 0.02 0.00 98.95 20.51 0.05 77.56 0.00
    ^C
    # ./turbostat.new -s
    %c0 GHz TSC %c1 %c3 %c6 %c7 %pc2 %pc3 %pc6 %pc7
    0.27 1.20 2.29 0.24 0.01 0.00 99.49 20.58 0.00 78.20 0.00
    0.27 1.22 2.29 0.25 0.00 0.00 99.48 20.79 0.00 77.85 0.00
    0.27 1.20 2.29 0.25 0.02 0.00 99.46 20.71 0.03 77.89 0.00
    0.28 1.26 2.29 0.25 0.01 0.00 99.46 20.89 0.02 77.67 0.00
    0.27 1.20 2.29 0.24 0.01 0.00 99.48 20.65 0.00 78.04 0.00

    cc: Youquan Song
    Signed-off-by: Len Brown

    Len Brown
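
The "re-schedule itself in turn to each CPU" approach can be sketched as a loop over the allowed CPUs. This is an illustration only: `read_local` is a hypothetical callback standing in for the per-CPU MSR reads (which in a real tool would go through /dev/cpu/*/msr and need root), and the affinity calls are Python's Linux-only wrappers.

```python
import os


def sample_all_cpus(read_local, cpus=None):
    """Visit each CPU in turn and call read_local(cpu) while running there."""
    original = os.sched_getaffinity(0)
    targets = sorted(original if cpus is None else cpus)
    samples = {}
    try:
        for cpu in targets:
            os.sched_setaffinity(0, {cpu})   # migrate this process to the CPU
            samples[cpu] = read_local(cpu)   # reads are now local to that CPU
    finally:
        os.sched_setaffinity(0, original)    # restore the starting mask
    return samples


def where_am_i(cpu):
    """Hypothetical reader: report the affinity mask seen during the read."""
    return set(os.sched_getaffinity(0))
```

Each migration costs one reschedule, after which any number of reads on that CPU need no cross-CPU call.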
     
  • turbostat -s
    cuts down on the amount of output, per user request.

    Also tweak some output whitespace and the man page.

    Signed-off-by: Len Brown

    Len Brown
     

03 Mar, 2012

13 commits


19 Jan, 2012

1 commit

  • This includes initial support for the recently published ACPI 5.0 spec.
    In particular, support for the "hardware-reduced" bit that eliminates
    the dependency on legacy hardware.

    APEI has patches resulting from testing on real hardware.

    Plus other random fixes.

    * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux: (52 commits)
    acpi/apei/einj: Add extensions to EINJ from rev 5.0 of acpi spec
    intel_idle: Split up and provide per CPU initialization func
    ACPI processor: Remove unneeded variable passed by acpi_processor_hotadd_init V2
    ACPI processor: Remove unneeded cpuidle_unregister_driver call
    intel idle: Make idle driver more robust
    intel_idle: Fix a cast to pointer from integer of different size warning in intel_idle
    ACPI: kernel-parameters.txt : Add intel_idle.max_cstate
    intel_idle: remove redundant local_irq_disable() call
    ACPI processor: Fix error path, also remove sysdev link
    ACPI: processor: fix acpi_get_cpuid for UP processor
    intel_idle: fix API misuse
    ACPI APEI: Convert atomicio routines
    ACPI: Export interfaces for ioremapping/iounmapping ACPI registers
    ACPI: Fix possible alignment issues with GAS 'address' references
    ACPI, ia64: Use SRAT table rev to use 8bit or 16/32bit PXM fields (ia64)
    ACPI, x86: Use SRAT table rev to use 8bit or 32bit PXM fields (x86/x86-64)
    ACPI: Store SRAT table revision
    ACPI, APEI, Resolve false conflict between ACPI NVS and APEI
    ACPI, Record ACPI NVS regions
    ACPI, APEI, EINJ, Refine the fix of resource conflict
    ...

    Linus Torvalds
     

18 Jan, 2012

1 commit


15 Dec, 2011

1 commit


18 Nov, 2011

1 commit


07 Nov, 2011

1 commit


19 Aug, 2011

3 commits


16 Aug, 2011

5 commits


03 Aug, 2011

2 commits


30 Jul, 2011

3 commits

  • IA32-Intel Devel guide Volume 3A - 14.3.2.1
    -------------------------------------------
    ...
    Opportunistic processor performance operation can be disabled by setting bit 38 of
    IA32_MISC_ENABLES. This mechanism is intended for BIOS only. If
    IA32_MISC_ENABLES[38] is set, CPUID.06H:EAX[1] will return 0.

    Better to detect things via cpuid; this cleans up the code a bit,
    and the MSR parts were not working correctly anyway.

    Signed-off-by: Thomas Renninger
    CC: lenb@kernel.org
    CC: linux@dominikbrodowski.net
    CC: cpufreq@vger.kernel.org
    Signed-off-by: Dominik Brodowski

    Thomas Renninger
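
The cpuid-based check reduces to testing one bit of a register value. The sketch below assumes the caller has already obtained EAX from CPUID leaf 06H by some other means; the function and constant names are hypothetical, not those used in cpupower.

```python
TURBO_BIT = 1  # CPUID.06H:EAX[1], per the manual excerpt quoted in this commit


def turbo_available(cpuid_06_eax):
    """Return True if bit 1 of the supplied CPUID.06H EAX value is set."""
    return bool((cpuid_06_eax >> TURBO_BIT) & 1)
```

Per the quoted excerpt, the bit reads as 0 when IA32_MISC_ENABLES[38] is set, so this single test also covers the case where opportunistic operation has been disabled.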
     
  • This adds the last piece missing from turbostat (if called with -v).
    It shows on Intel machines supporting Turbo Boost how many cores
    have to be active/idle to enter which boost mode (frequency).

    Whether the HW really enters these boost modes can be verified via
    ./cpupower monitor.

    Signed-off-by: Thomas Renninger
    CC: lenb@kernel.org
    CC: linux@dominikbrodowski.net
    CC: cpufreq@vger.kernel.org
    Signed-off-by: Dominik Brodowski

    Thomas Renninger
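
One way to picture the "how many cores active for which boost frequency" table is a register that packs one ratio byte per active-core count. That layout, the 100 MHz bclk default, and the sample value below are assumptions of this sketch; the commit text does not describe the actual source of the data.

```python
def decode_turbo_ratios(raw, bclk_mhz=100.0, max_cores=4):
    """Map active-core count -> max turbo MHz.

    Assumes byte N-1 of `raw` holds the ratio for N active cores.
    """
    limits = {}
    for active in range(1, max_cores + 1):
        ratio = (raw >> (8 * (active - 1))) & 0xFF
        if ratio:  # skip empty slots
            limits[active] = ratio * bclk_mhz
    return limits


# Hypothetical register value: ratios 30, 29, 28, 28 for 1..4 active cores.
example = decode_turbo_ratios(0x1C1C1D1E)
```

The fewer cores are active, the higher the ratio in this example, which is the relationship the verbose output is meant to show.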
     
  • Larger sysfs data (>255 bytes) was truncated and thus used improperly.

    [linux@dominikbrodowski.net: adapted to cpupowerutils]
    Signed-off-by: Roman Vasiyarov
    Signed-off-by: Dominik Brodowski

    Roman Vasiyarov
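
The truncation problem is the usual one with a single fixed-size read. The sketch below contrasts that with reading until end of file, using an in-memory stream in place of a sysfs file; it illustrates the failure mode only and is not the fix that was applied in cpupowerutils.

```python
import io


def read_fixed(stream, size=255):
    """Single bounded read: anything past `size` characters is lost."""
    return stream.read(size)


def read_all(stream, chunk=255):
    """Keep reading fixed-size chunks until the stream is exhausted."""
    parts = []
    while True:
        piece = stream.read(chunk)
        if not piece:
            break
        parts.append(piece)
    return "".join(parts)


# Hypothetical attribute contents longer than 255 characters.
payload = " ".join(str(freq) for freq in range(1000000, 1000060))
truncated = read_fixed(io.StringIO(payload))
complete = read_all(io.StringIO(payload))
```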