Eric Lee / smarc-fsl-linux-kernel

12 Mar, 2019

1 commit

8e3b40395 cpufreq: intel_pstate: Fix up iowait_boost computation ... Browse Code »

After commit b8bd1581aa61 ("cpufreq: intel_pstate: Rework iowait
boosting to be less aggressive") the handling of the case when
the SCHED_CPUFREQ_IOWAIT flag is set again after a few iterations of
intel_pstate_update_util() is a bit inconsistent, because the
new value of cpu->iowait_boost may be lower than ONE_EIGHTH_FP
if it was set before, but has not dropped down to zero just yet.

Fix that up by ensuring that the new value of cpu->iowait_boost
will always be at least ONE_EIGHTH_FP then.

Fixes: b8bd1581aa61 ("cpufreq: intel_pstate: Rework iowait boosting to be less aggressive")
Signed-off-by: Rafael J. Wysocki

Rafael J. Wysocki
2019-03-12 16:47:30 +0800

18 Feb, 2019

3 commits

b8bd1581a cpufreq: intel_pstate: Rework iowait boosting to be less aggressive ... Browse Code »

The current iowait boosting mechanism in intel_pstate_update_util()
is quite aggressive, as it goes to the maximum P-state right away,
and may cause excessive amounts of energy to be used, which is not
desirable and arguably isn't necessary too.

Follow commit a5a0809bc58e ("cpufreq: schedutil: Make iowait boost
more energy efficient") that reworked the analogous iowait boost
mechanism in the schedutil governor and make the iowait boosting
in intel_pstate_update_util() work along the same lines.

Signed-off-by: Rafael J. Wysocki

Rafael J. Wysocki
2019-02-18 18:34:32 +0800
a8e1942d9 cpufreq: intel_pstate: Eliminate intel_pstate_get_base_pstate() ... Browse Code »

There is only one caller of intel_pstate_get_base_pstate() and it is
more straightforward to carry out the computation directly in the
caller, so do that and drop intel_pstate_get_base_pstate().

No intentional changes of behavior.

Signed-off-by: Rafael J. Wysocki

Rafael J. Wysocki
2019-02-18 18:34:32 +0800
fa93b51c5 cpufreq: intel_pstate: Avoid redundant initialization of local vars ... Browse Code »

After commit 1a4fe38add8b ("cpufreq: intel_pstate: Remove max/min
fractions to limit performance") the initial value of the pstate local
variable in intel_pstate_max_within_limits() and the initial value of
the max_pstate local variable in intel_pstate_prepare_request() are
both immediately discarded, so initialize both these variables to
their target values upfront.

No intentional changes of behavior.

Signed-off-by: Rafael J. Wysocki

Rafael J. Wysocki
2019-02-18 18:34:32 +0800

13 Feb, 2019

1 commit

076b862c7 cpufreq: intel_pstate: Add reasons for failure and debug messages ... Browse Code »

The init code path has several exceptions where the driver can
decide not to load.

As CONFIG_X86_INTEL_PSTATE is generally set to Y, the return code is
not reachable. The initialization code is neither verbose of the
reason why it did choose to prematurely exit, so it is difficult for
a user to determine, on a given platform, why the driver didn't load
properly.

This patch is about reporting to the user the reason/context of why
the driver failed to load. That is a precious hint when debugging
a platform.

Signed-off-by: Erwan Velu
[ rjw: Subject & changelog, minor fixups ]
Signed-off-by: Rafael J. Wysocki

Erwan Velu
2019-02-13 19:32:10 +0800

29 Jan, 2019

1 commit

625c85a62 cpufreq: Use struct kobj_attribute instead of struct global_attr ... Browse Code »

The cpufreq_global_kobject is created using kobject_create_and_add()
helper, which assigns the kobj_type as dynamic_kobj_ktype and show/store
routines are set to kobj_attr_show() and kobj_attr_store().

These routines pass struct kobj_attribute as an argument to the
show/store callbacks. But all the cpufreq files created using the
cpufreq_global_kobject expect the argument to be of type struct
attribute. Things work fine currently as no one accesses the "attr"
argument. We may not see issues even if the argument is used, as struct
kobj_attribute has struct attribute as its first element and so they
will both get same address.

But this is logically incorrect and we should rather use struct
kobj_attribute instead of struct global_attr in the cpufreq core and
drivers and the show/store callbacks should take struct kobj_attribute
as argument instead.

This bug is caught using CFI CLANG builds in android kernel which
catches mismatch in function prototypes for such callbacks.

Reported-by: Donghee Han
Reported-by: Sangkyu Kim
Signed-off-by: Viresh Kumar
Signed-off-by: Rafael J. Wysocki

Viresh Kumar
2019-01-29 18:44:30 +0800

27 Dec, 2018

1 commit

792bf4d87 Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip ... Browse Code »

Pull RCU updates from Ingo Molnar:
"The biggest RCU changes in this cycle were:

- Convert RCU's BUG_ON() and similar calls to WARN_ON() and similar.

- Replace calls of RCU-bh and RCU-sched update-side functions to
their vanilla RCU counterparts. This series is a step towards
complete removal of the RCU-bh and RCU-sched update-side functions.

( Note that some of these conversions are going upstream via their
respective maintainers. )

- Documentation updates, including a number of flavor-consolidation
updates from Joel Fernandes.

- Miscellaneous fixes.

- Automate generation of the initrd filesystem used for rcutorture
testing.

- Convert spin_is_locked() assertions to instead use lockdep.

( Note that some of these conversions are going upstream via their
respective maintainers. )

- SRCU updates, especially including a fix from Dennis Krein for a
bag-on-head-class bug.

- RCU torture-test updates"

* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (112 commits)
rcutorture: Don't do busted forward-progress testing
rcutorture: Use 100ms buckets for forward-progress callback histograms
rcutorture: Recover from OOM during forward-progress tests
rcutorture: Print forward-progress test age upon failure
rcutorture: Print time since GP end upon forward-progress failure
rcutorture: Print histogram of CB invocation at OOM time
rcutorture: Print GP age upon forward-progress failure
rcu: Print per-CPU callback counts for forward-progress failures
rcu: Account for nocb-CPU callback counts in RCU CPU stall warnings
rcutorture: Dump grace-period diagnostics upon forward-progress OOM
rcutorture: Prepare for asynchronous access to rcu_fwd_startat
torture: Remove unnecessary "ret" variables
rcutorture: Affinity forward-progress test to avoid housekeeping CPUs
rcutorture: Break up too-long rcu_torture_fwd_prog() function
rcutorture: Remove cbflood facility
torture: Bring any extra CPUs online during kernel startup
rcutorture: Add call_rcu() flooding forward-progress tests
rcutorture/formal: Replace synchronize_sched() with synchronize_rcu()
tools/kernel.h: Replace synchronize_sched() with synchronize_rcu()
net/decnet: Replace rcu_barrier_bh() with rcu_barrier()
...

Linus Torvalds
2018-12-27 05:07:19 +0800

30 Nov, 2018

1 commit

af3b7379e cpufreq: intel_pstate: Force HWP min perf before offline ... Browse Code »

Force HWP Request MAX = HWP Request MIN = HWP Capability MIN and EPP to
0xFF. In this way the performance limits on the offlined CPU will not
influence performance limits on its sibling CPU, which is still online.

If the sibling CPU is calling for higher performance, it will impact the
max core performance. Here core performance will follow higher of the
performance requests from each sibling.

Reported-and-tested-by: Chen Yu
Signed-off-by: Srinivas Pandruvada
Signed-off-by: Rafael J. Wysocki

Srinivas Pandruvada
2018-11-30 05:31:58 +0800

28 Nov, 2018

1 commit

09659af30 cpufreq/intel_pstate: Replace synchronize_sched() with synchronize_rcu() ... Browse Code »

Now that synchronize_rcu() waits for preempt-disable regions of code
as well as RCU read-side critical sections, synchronize_sched() can be
replaced by synchronize_rcu(). This commit therefore makes this change.

Signed-off-by: Paul E. McKenney
Cc: Srinivas Pandruvada
Cc: Len Brown
Cc: "Rafael J. Wysocki"
Cc: Viresh Kumar
Cc:

Paul E. McKenney
2018-11-28 01:21:38 +0800

26 Oct, 2018

1 commit

5906056e5 cpufreq: intel_pstate: Fix compilation for !CONFIG_ACPI ... Browse Code »

While at it, add a few comments which config options #ifdef
and #else statements refer to.

Fixes: 86d333a8cc7f (cpufreq: intel_pstate: Add base_frequency attribute)
Signed-off-by: Dominik Brodowski
Acked-by: Srinivas Pandruvada
Signed-off-by: Rafael J. Wysocki

Dominik Brodowski
2018-10-26 00:37:06 +0800

23 Oct, 2018

1 commit

c05f3642f Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip ... Browse Code »

Pull perf updates from Ingo Molnar:
"The main updates in this cycle were:

- Lots of perf tooling changes too voluminous to list (big perf trace
and perf stat improvements, lots of libtraceevent reorganization,
etc.), so I'll list the authors and refer to the changelog for
details:

Benjamin Peterson, Jérémie Galarneau, Kim Phillips, Peter
Zijlstra, Ravi Bangoria, Sangwon Hong, Sean V Kelley, Steven
Rostedt, Thomas Gleixner, Ding Xiang, Eduardo Habkost, Thomas
Richter, Andi Kleen, Sanskriti Sharma, Adrian Hunter, Tzvetomir
Stoyanov, Arnaldo Carvalho de Melo, Jiri Olsa.

... with the bulk of the changes written by Jiri Olsa, Tzvetomir
Stoyanov and Arnaldo Carvalho de Melo.

- Continued intel_rdt work with a focus on playing well with perf
events. This also imported some non-perf RDT work due to
dependencies. (Reinette Chatre)

- Implement counter freezing for Arch Perfmon v4 (Skylake and newer).
This allows to speed up the PMI handler by avoiding unnecessary MSR
writes and make it more accurate. (Andi Kleen)

- kprobes cleanups and simplification (Masami Hiramatsu)

- Intel Goldmont PMU updates (Kan Liang)

- ... plus misc other fixes and updates"

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (155 commits)
kprobes/x86: Use preempt_enable() in optimized_callback()
x86/intel_rdt: Prevent pseudo-locking from using stale pointers
kprobes, x86/ptrace.h: Make regs_get_kernel_stack_nth() not fault on bad stack
perf/x86/intel: Export mem events only if there's PEBS support
x86/cpu: Drop pointless static qualifier in punit_dev_state_show()
x86/intel_rdt: Fix initial allocation to consider CDP
x86/intel_rdt: CBM overlap should also check for overlap with CDP peer
x86/intel_rdt: Introduce utility to obtain CDP peer
tools lib traceevent, perf tools: Move struct tep_handler definition in a local header file
tools lib traceevent: Separate out tep_strerror() for strerror_r() issues
perf python: More portable way to make CFLAGS work with clang
perf python: Make clang_has_option() work on Python 3
perf tools: Free temporary 'sys' string in read_event_files()
perf tools: Avoid double free in read_event_file()
perf tools: Free 'printk' string in parse_ftrace_printk()
perf tools: Cleanup trace-event-info 'tdata' leak
perf strbuf: Match va_{add,copy} with va_end
perf test: S390 does not support watchpoints in test 22
perf auxtrace: Include missing asm/bitsperlong.h to get BITS_PER_LONG
tools include: Adopt linux/bits.h
...

Linus Torvalds
2018-10-23 20:32:18 +0800

16 Oct, 2018

1 commit

86d333a8c cpufreq: intel_pstate: Add base_frequency attribute ... Browse Code »

Expose base_frequency to user space via cpufreq sysfs when HWP is in
use.

This HWP base frequency is read from the ACPI _CPC object if present,
or from the HWP Capabilities MSR otherwise.

On the majority of the HWP platforms the _CPC object will point to
the HWP Capabilities MSR using the "Functional Fixed Hardware"
address space type. The address space type also can simply be
ACPI_TYPE_INTEGER, however, in which case the platform firmware
can set its value at the initialization time based on the system
constraints.

Signed-off-by: Srinivas Pandruvada
[ rjw: Changelog ]
Signed-off-by: Rafael J. Wysocki

Srinivas Pandruvada
2018-10-16 16:33:39 +0800

02 Oct, 2018

1 commit

f2c4db1bd x86/cpu: Sanitize FAM6_ATOM naming ... Browse Code »

Going primarily by:

https://en.wikipedia.org/wiki/List_of_Intel_Atom_microprocessors

with additional information gleaned from other related pages; notably:

- Bonnell shrink was called Saltwell
- Moorefield is the Merriefield refresh which makes it Airmont

The general naming scheme is: FAM6_ATOM_UARCH_SOCTYPE

for i in `git grep -l FAM6_ATOM` ; do
sed -i -e 's/ATOM_PINEVIEW/ATOM_BONNELL/g' \
-e 's/ATOM_LINCROFT/ATOM_BONNELL_MID/' \
-e 's/ATOM_PENWELL/ATOM_SALTWELL_MID/g' \
-e 's/ATOM_CLOVERVIEW/ATOM_SALTWELL_TABLET/g' \
-e 's/ATOM_CEDARVIEW/ATOM_SALTWELL/g' \
-e 's/ATOM_SILVERMONT1/ATOM_SILVERMONT/g' \
-e 's/ATOM_SILVERMONT2/ATOM_SILVERMONT_X/g' \
-e 's/ATOM_MERRIFIELD/ATOM_SILVERMONT_MID/g' \
-e 's/ATOM_MOOREFIELD/ATOM_AIRMONT_MID/g' \
-e 's/ATOM_DENVERTON/ATOM_GOLDMONT_X/g' \
-e 's/ATOM_GEMINI_LAKE/ATOM_GOLDMONT_PLUS/g' ${i}
done

Signed-off-by: Peter Zijlstra (Intel)
Cc: Alexander Shishkin
Cc: Arnaldo Carvalho de Melo
Cc: Jiri Olsa
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Thomas Gleixner
Cc: Vince Weaver
Cc: dave.hansen@linux.intel.com
Cc: len.brown@intel.com
Signed-off-by: Ingo Molnar

Peter Zijlstra
2018-10-02 16:14:32 +0800

06 Aug, 2018

2 commits

d3264f752 cpufreq: intel_pstate: Ignore turbo active ratio in HWP ... Browse Code »

When HWP is active turbo active ratio is not used, so we should allow
policy max frequency above turbo activation ratio to be set. When HWP is
not active, then any policy max frequency above turbo activation ratio
can result upto max one-core turbo frequency.

This fix helps better thermal control in turbo region when other methods
like "Running Average Power Limit" is not available to use.

Signed-off-by: Srinivas Pandruvada
Signed-off-by: Rafael J. Wysocki

Srinivas Pandruvada
2018-08-06 16:22:27 +0800
6ccbe1dcd Merge back cpufreq changes for 4.19. Browse Code »

Rafael J. Wysocki
2018-08-06 16:09:52 +0800

31 Jul, 2018

1 commit

01e61a42a cpufreq: intel_pstate: Limit the scope of HWP dynamic boost platforms ... Browse Code »

Dynamic boosting of HWP performance on IO wake showed significant
improvement to IO workloads. This series was intended for Skylake Xeon
platforms only and feature was enabled by default based on CPU model
number.

But some Xeon platforms reused the Skylake desktop CPU model number. This
caused some undesirable side effects to some graphics workloads. Since
they are heavily IO bound, the increase in CPU performance decreased the
power available for GPU to do its computing and hence decrease in graphics
benchmark performance.

For example on a Skylake desktop, GpuTest benchmark showed average FPS
reduction from 529 to 506.

This change makes sure that HWP boost feature is only enabled for Skylake
server platforms by using ACPI FADT preferred PM Profile. If some desktop
users wants to get benefit of boost, they can still enable boost from
intel_pstate sysfs attribute "hwp_dynamic_boost".

Fixes: 41ab43c9c89e (cpufreq: intel_pstate: enable boost for Skylake Xeon)
Link: https://bugs.freedesktop.org/show_bug.cgi?id=107410
Reported-by: Eero Tamminen
Signed-off-by: Srinivas Pandruvada
Reviewed-by: Francisco Jerez
Acked-by: Mel Gorman
Signed-off-by: Rafael J. Wysocki

Srinivas Pandruvada
2018-07-31 16:39:58 +0800

25 Jul, 2018

1 commit

6e926363f Merge back cpufreq material for 4.19. Browse Code »

Rafael J. Wysocki
2018-07-25 19:28:01 +0800

19 Jul, 2018

1 commit

eea033d07 cpufreq: intel_pstate: Show different max frequency with turbo 3 and HWP ... Browse Code »

On HWP platforms with Turbo 3.0, the HWP capability max ratio shows the
maximum ratio of that core, which can be different than other cores. If
we show the correct maximum frequency in cpufreq sysfs via
cpuinfo_max_freq and scaling_max_freq then, user can know which cores
can run faster for pinning some high priority tasks.

Currently the max turbo frequency is shown as max frequency, which is
the max of all cores, even if some cores can't reach that frequency
even for single threaded workload.

But it is possible that max ratio in HWP capabilities is set as 0xFF or
some high invalid value (E.g. One KBL NUC). Since the actual performance
can never exceed 1 core turbo frequency from MSR TURBO_RATIO_LIMIT, we
use this as a bound check.

Signed-off-by: Srinivas Pandruvada
Signed-off-by: Rafael J. Wysocki

Srinivas Pandruvada
2018-07-19 18:53:03 +0800

18 Jul, 2018

1 commit

95d6c0857 cpufreq: intel_pstate: Register when ACPI PCCH is present ... Browse Code »

Currently, intel_pstate doesn't register if _PSS is not present on
HP Proliant systems, because it expects the firmware to take over
CPU performance scaling in that case. However, if ACPI PCCH is
present, the firmware expects the kernel to use it for CPU
performance scaling and the pcc-cpufreq driver is loaded for that.

Unfortunately, the firmware interface used by that driver is not
scalable for fundamental reasons, so pcc-cpufreq is way suboptimal
on systems with more than just a few CPUs. In fact, it is better to
avoid using it at all.

For this reason, modify intel_pstate to look for ACPI PCCH if _PSS
is not present and register if it is there. Also prevent the
pcc-cpufreq driver from trying to initialize itself if intel_pstate
has been registered already.

Fixes: fbbcdc0744da (intel_pstate: skip the driver if ACPI has power mgmt option)
Reported-by: Andreas Herrmann
Reviewed-by: Andreas Herrmann
Acked-by: Srinivas Pandruvada
Tested-by: Andreas Herrmann
Cc: 4.16+ # 4.16+
Signed-off-by: Rafael J. Wysocki

Rafael J. Wysocki
2018-07-18 19:38:37 +0800

02 Jul, 2018

1 commit

1111b7836 cpufreq: intel_pstate: use match_string() helper ... Browse Code »

match_string() returns the index of an array for a matching string,
which can be used instead of open coded variant.

Reviewed-by: Andy Shevchenko
Signed-off-by: Yisheng Xie
Acked-by: Srinivas Pandruvada
Signed-off-by: Rafael J. Wysocki

Xie Yisheng
2018-07-02 17:25:40 +0800

19 Jun, 2018

1 commit

ff7c99171 cpufreq: intel_pstate: Fix scaling max/min limits with Turbo 3.0 ... Browse Code »

When scaling max/min settings are changed, internally they are converted
to a ratio using the max turbo 1 core turbo frequency. This works fine
when 1 core max is same irrespective of the core. But under Turbo 3.0,
this will not be the case. For example:
Core 0: max turbo pstate: 43 (4.3GHz)
Core 1: max turbo pstate: 45 (4.5GHz)
In this case 1 core turbo ratio will be maximum of all, so it will be
45 (4.5GHz). Suppose scaling max is set to 4GHz (ratio 40) for all cores
,then on core one it will be
= max_state * policy->max / max_freq;
= 43 * (4000000/4500000) = 38 (3.8GHz)
= 38
which is 200MHz less than the desired.
On core2, it will be correctly set to ratio 40 (4GHz). Same holds true
for scaling min frequency limit. So this requires usage of correct turbo
max frequency for core one, which in this case is 4.3GHz. So we need to
adjust per CPU cpu->pstate.turbo_freq using the maximum HWP ratio of that
core.

This change uses the HWP capability of a core to adjust max turbo
frequency. But since Broadwell HWP doesn't use ratios in the HWP
capabilities, we have to use legacy max 1 core turbo ratio. This is not
a problem as the HWP capabilities don't differ among cores in Broadwell.
We need to check for non Broadwell CPU model for applying this change,
though.

Signed-off-by: Srinivas Pandruvada
Cc: 4.6+ # 4.6+
Signed-off-by: Rafael J. Wysocki

Srinivas Pandruvada
2018-06-19 16:40:29 +0800

13 Jun, 2018

2 commits

d09fcecb0 Merge tag 'pm-4.18-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm ... Browse Code »

Pull more power management updates from Rafael Wysocki:
"These revert a recent PM core change that introduced a regression, fix
the build when the recently added Kryo cpufreq driver is selected, add
support for devices attached to multiple power domains to the generic
power domains (genpd) framework, add support for iowait boosting on
systens with hardware-managed P-states (HWP) enabled to the
intel_pstate driver, modify the behavior of the wakeup_count device
attribute in sysfs, fix a few issues and clean up some ugliness,
mostly in cpufreq (core and drivers) and in the cpupower utility.

Specifics:

- Revert a recent PM core change that attempted to fix an issue
related to device links, but introduced a regression (Rafael
Wysocki)

- Fix build when the recently added cpufreq driver for Kryo
processors is selected by making it possible to build that driver
as a module (Arnd Bergmann)

- Fix the long idle detection mechanism in the out-of-band (ondemand
and conservative) cpufreq governors (Chen Yu)

- Add support for devices in multiple power domains to the generic
power domains (genpd) framework (Ulf Hansson)

- Add support for iowait boosting on systems with hardware-managed
P-states (HWP) enabled to the intel_pstate driver and make it use
that feature on systems with Skylake Xeon processors as it is
reported to improve performance significantly on those systems
(Srinivas Pandruvada)

- Fix and update the acpi_cpufreq, ti-cpufreq and imx6q cpufreq
drivers (Colin Ian King, Suman Anna, Sébastien Szymanski)

- Change the behavior of the wakeup_count device attribute in sysfs
to expose the number of events when the device might have aborted
system suspend in progress (Ravi Chandra Sadineni)

- Fix two minor issues in the cpupower utility (Abhishek Goel, Colin
Ian King)"

* tag 'pm-4.18-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
Revert "PM / runtime: Fixup reference counting of device link suppliers at probe"
cpufreq: imx6q: check speed grades for i.MX6ULL
cpufreq: governors: Fix long idle detection logic in load calculation
cpufreq: intel_pstate: enable boost for Skylake Xeon
PM / wakeup: Export wakeup_count instead of event_count via sysfs
PM / Domains: Add dev_pm_domain_attach_by_id() to manage multi PM domains
PM / Domains: Add support for multi PM domains per device to genpd
PM / Domains: Split genpd_dev_pm_attach()
PM / Domains: Don't attach devices in genpd with multi PM domains
PM / Domains: dt: Allow power-domain property to be a list of specifiers
cpufreq: intel_pstate: New sysfs entry to control HWP boost
cpufreq: intel_pstate: HWP boost performance on IO wakeup
cpufreq: intel_pstate: Add HWP boost utility and sched util hooks
cpufreq: ti-cpufreq: Use devres managed API in probe()
cpufreq: ti-cpufreq: Fix an incorrect error return value
cpufreq: ACPI: make function acpi_cpufreq_fast_switch() static
cpufreq: kryo: allow building as a loadable module
cpupower : Fix header name to read idle state name
cpupower: fix spelling mistake: "logilename" -> "logfilename"

Linus Torvalds
2018-06-13 22:24:18 +0800
fad953ce0 treewide: Use array_size() in vzalloc() ... Browse Code »

The vzalloc() function has no 2-factor argument form, so multiplication
factors need to be wrapped in array_size(). This patch replaces cases of:

vzalloc(a * b)

with:
vzalloc(array_size(a, b))

as well as handling cases of:

vzalloc(a * b * c)

with:

vzalloc(array3_size(a, b, c))

This does, however, attempt to ignore constant size factors like:

vzalloc(4 * 1024)

though any constants defined via macros get caught up in the conversion.

Any factors with a sizeof() of "unsigned char", "char", and "u8" were
dropped, since they're redundant.

The Coccinelle script used for this was:

// Fix redundant parens around sizeof().
@@
type TYPE;
expression THING, E;
@@

(
vzalloc(
- (sizeof(TYPE)) * E
+ sizeof(TYPE) * E
, ...)
|
vzalloc(
- (sizeof(THING)) * E
+ sizeof(THING) * E
, ...)
)

// Drop single-byte sizes and redundant parens.
@@
expression COUNT;
typedef u8;
typedef __u8;
@@

(
vzalloc(
- sizeof(u8) * (COUNT)
+ COUNT
, ...)
|
vzalloc(
- sizeof(__u8) * (COUNT)
+ COUNT
, ...)
|
vzalloc(
- sizeof(char) * (COUNT)
+ COUNT
, ...)
|
vzalloc(
- sizeof(unsigned char) * (COUNT)
+ COUNT
, ...)
|
vzalloc(
- sizeof(u8) * COUNT
+ COUNT
, ...)
|
vzalloc(
- sizeof(__u8) * COUNT
+ COUNT
, ...)
|
vzalloc(
- sizeof(char) * COUNT
+ COUNT
, ...)
|
vzalloc(
- sizeof(unsigned char) * COUNT
+ COUNT
, ...)
)

// 2-factor product with sizeof(type/expression) and identifier or constant.
@@
type TYPE;
expression THING;
identifier COUNT_ID;
constant COUNT_CONST;
@@

(
vzalloc(
- sizeof(TYPE) * (COUNT_ID)
+ array_size(COUNT_ID, sizeof(TYPE))
, ...)
|
vzalloc(
- sizeof(TYPE) * COUNT_ID
+ array_size(COUNT_ID, sizeof(TYPE))
, ...)
|
vzalloc(
- sizeof(TYPE) * (COUNT_CONST)
+ array_size(COUNT_CONST, sizeof(TYPE))
, ...)
|
vzalloc(
- sizeof(TYPE) * COUNT_CONST
+ array_size(COUNT_CONST, sizeof(TYPE))
, ...)
|
vzalloc(
- sizeof(THING) * (COUNT_ID)
+ array_size(COUNT_ID, sizeof(THING))
, ...)
|
vzalloc(
- sizeof(THING) * COUNT_ID
+ array_size(COUNT_ID, sizeof(THING))
, ...)
|
vzalloc(
- sizeof(THING) * (COUNT_CONST)
+ array_size(COUNT_CONST, sizeof(THING))
, ...)
|
vzalloc(
- sizeof(THING) * COUNT_CONST
+ array_size(COUNT_CONST, sizeof(THING))
, ...)
)

// 2-factor product, only identifiers.
@@
identifier SIZE, COUNT;
@@

vzalloc(
- SIZE * COUNT
+ array_size(COUNT, SIZE)
, ...)

// 3-factor product with 1 sizeof(type) or sizeof(expression), with
// redundant parens removed.
@@
expression THING;
identifier STRIDE, COUNT;
type TYPE;
@@

(
vzalloc(
- sizeof(TYPE) * (COUNT) * (STRIDE)
+ array3_size(COUNT, STRIDE, sizeof(TYPE))
, ...)
|
vzalloc(
- sizeof(TYPE) * (COUNT) * STRIDE
+ array3_size(COUNT, STRIDE, sizeof(TYPE))
, ...)
|
vzalloc(
- sizeof(TYPE) * COUNT * (STRIDE)
+ array3_size(COUNT, STRIDE, sizeof(TYPE))
, ...)
|
vzalloc(
- sizeof(TYPE) * COUNT * STRIDE
+ array3_size(COUNT, STRIDE, sizeof(TYPE))
, ...)
|
vzalloc(
- sizeof(THING) * (COUNT) * (STRIDE)
+ array3_size(COUNT, STRIDE, sizeof(THING))
, ...)
|
vzalloc(
- sizeof(THING) * (COUNT) * STRIDE
+ array3_size(COUNT, STRIDE, sizeof(THING))
, ...)
|
vzalloc(
- sizeof(THING) * COUNT * (STRIDE)
+ array3_size(COUNT, STRIDE, sizeof(THING))
, ...)
|
vzalloc(
- sizeof(THING) * COUNT * STRIDE
+ array3_size(COUNT, STRIDE, sizeof(THING))
, ...)
)

// 3-factor product with 2 sizeof(variable), with redundant parens removed.
@@
expression THING1, THING2;
identifier COUNT;
type TYPE1, TYPE2;
@@

(
vzalloc(
- sizeof(TYPE1) * sizeof(TYPE2) * COUNT
+ array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
, ...)
|
vzalloc(
- sizeof(TYPE1) * sizeof(THING2) * (COUNT)
+ array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
, ...)
|
vzalloc(
- sizeof(THING1) * sizeof(THING2) * COUNT
+ array3_size(COUNT, sizeof(THING1), sizeof(THING2))
, ...)
|
vzalloc(
- sizeof(THING1) * sizeof(THING2) * (COUNT)
+ array3_size(COUNT, sizeof(THING1), sizeof(THING2))
, ...)
|
vzalloc(
- sizeof(TYPE1) * sizeof(THING2) * COUNT
+ array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
, ...)
|
vzalloc(
- sizeof(TYPE1) * sizeof(THING2) * (COUNT)
+ array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
, ...)
)

// 3-factor product, only identifiers, with redundant parens removed.
@@
identifier STRIDE, SIZE, COUNT;
@@

(
vzalloc(
- (COUNT) * STRIDE * SIZE
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
vzalloc(
- COUNT * (STRIDE) * SIZE
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
vzalloc(
- COUNT * STRIDE * (SIZE)
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
vzalloc(
- (COUNT) * (STRIDE) * SIZE
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
vzalloc(
- COUNT * (STRIDE) * (SIZE)
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
vzalloc(
- (COUNT) * STRIDE * (SIZE)
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
vzalloc(
- (COUNT) * (STRIDE) * (SIZE)
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
vzalloc(
- COUNT * STRIDE * SIZE
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
)

// Any remaining multi-factor products, first at least 3-factor products
// when they're not all constants...
@@
expression E1, E2, E3;
constant C1, C2, C3;
@@

(
vzalloc(C1 * C2 * C3, ...)
|
vzalloc(
- E1 * E2 * E3
+ array3_size(E1, E2, E3)
, ...)
)

// And then all remaining 2 factors products when they're not all constants.
@@
expression E1, E2;
constant C1, C2;
@@

(
vzalloc(C1 * C2, ...)
|
vzalloc(
- E1 * E2
+ array_size(E1, E2)
, ...)
)

Signed-off-by: Kees Cook

Kees Cook
2018-06-13 07:19:22 +0800

08 Jun, 2018

1 commit

41ab43c9c cpufreq: intel_pstate: enable boost for Skylake Xeon ... Browse Code »

Enable HWP boost on Skylake server and workstations.

Reported-by: Mel Gorman
Tested-by: Giovanni Gherdovich
Signed-off-by: Srinivas Pandruvada
Signed-off-by: Rafael J. Wysocki

Srinivas Pandruvada
2018-06-08 17:10:26 +0800

06 Jun, 2018

3 commits

aaaece3de cpufreq: intel_pstate: New sysfs entry to control HWP boost ... Browse Code »

A new attribute is added to intel_pstate sysfs to enable/disable
HWP dynamic performance boost.

Reported-by: Mel Gorman
Tested-by: Giovanni Gherdovich
Signed-off-by: Srinivas Pandruvada
Signed-off-by: Rafael J. Wysocki

Srinivas Pandruvada
2018-06-06 14:37:52 +0800
52ccc4314 cpufreq: intel_pstate: HWP boost performance on IO wakeup ... Browse Code »

This change uses SCHED_CPUFREQ_IOWAIT flag to boost HWP performance.
Since SCHED_CPUFREQ_IOWAIT flag is set frequently, we don't start
boosting steps unless we see two consecutive flags in two ticks. This
avoids boosting due to IO because of regular system activities.

To avoid synchronization issues, the actual processing of the flag is
done on the local CPU callback.

Reported-by: Mel Gorman
Tested-by: Giovanni Gherdovich
Signed-off-by: Srinivas Pandruvada
Signed-off-by: Rafael J. Wysocki

Srinivas Pandruvada
2018-06-06 14:37:52 +0800
e0efd5be6 cpufreq: intel_pstate: Add HWP boost utility and sched util hooks ... Browse Code »

Added two utility functions to HWP boost up gradually and boost down to
the default cached HWP request values.

Boost up:
Boost up updates HWP request minimum value in steps. This minimum value
can reach upto at HWP request maximum values depends on how frequently,
this boost up function is called. At max, boost up will take three steps
to reach the maximum, depending on the current HWP request levels and HWP
capabilities. For example, if the current settings are:
If P0 (Turbo max) = P1 (Guaranteed max) = min
No boost at all.
If P0 (Turbo max) > P1 (Guaranteed max) = min
Should result in one level boost only for P0.
If P0 (Turbo max) = P1 (Guaranteed max) > min
Should result in two level boost:
(min + p1)/2 and P1.
If P0 (Turbo max) > P1 (Guaranteed max) > min
Should result in three level boost:
(min + p1)/2, P1 and P0.
We don't set any level between P0 and P1 as there is no guarantee that
they will be honored.

Boost down:
After the system is idle for hold time of 3ms, the HWP request is reset
to the default value from HWP init or user modified one via sysfs.

Caching of HWP Request and Capabilities
Store the HWP request value last set using MSR_HWP_REQUEST and read
MSR_HWP_CAPABILITIES. This avoid reading of MSRs in the boost utility
functions.

These boost utility functions calculated limits are based on the latest
HWP request value, which can be modified by setpolicy() callback. So if
user space modifies the minimum perf value, that will be accounted for
every time the boost up is called. There will be case when there can be
contention with the user modified minimum perf, in that case user value
will gain precedence. For example just before HWP_REQUEST MSR is updated
from setpolicy() callback, the boost up function is called via scheduler
tick callback. Here the cached MSR value is already the latest and limits
are updated based on the latest user limits, but on return the MSR write
callback called from setpolicy() callback will update the HWP_REQUEST
value. This will be used till next time the boost up function is called.

In addition add a variable to control HWP dynamic boosting. When HWP
dynamic boost is active then set the HWP specific update util hook. The
contents in the utility hooks will be filled in the subsequent patches.

Reported-by: Mel Gorman
Tested-by: Giovanni Gherdovich
Signed-off-by: Srinivas Pandruvada
Signed-off-by: Rafael J. Wysocki

Srinivas Pandruvada
2018-06-06 14:37:52 +0800

15 May, 2018

1 commit

50e9ffaba cpufreq: intel_pstate: allow trace in passive mode ... Browse Code »

Allow use of the trace_pstate_sample trace function
when the intel_pstate driver is in passive mode.
Since the core_busy and scaled_busy fields are not
used, and it might be desirable to know which path
through the driver was used, either intel_cpufreq_target
or intel_cpufreq_fast_switch, re-task the core_busy
field as a flag indicator.

The user can then use the intel_pstate_tracer.py utility
to summarize and plot the trace.

Note: The core_busy feild still goes by that name
in include/trace/events/power.h and within the
intel_pstate_tracer.py script and csv file headers,
but it is graphed as "performance", and called
core_avg_perf now in the intel_pstate driver.

Sometimes, in passive mode, the driver is not called for
many tens or even hundreds of seconds. The user
needs to understand, and not be confused by, this limitation.

Signed-off-by: Doug Smythies
Signed-off-by: Srinivas Pandruvada
Signed-off-by: Rafael J. Wysocki

Doug Smythies
2018-05-15 05:34:25 +0800

10 Apr, 2018

1 commit

b258dfeaf cpufreq: intel_pstate: Do not include debugfs.h ... Browse Code »

The intel_pstate driver doesn't use debugfs any more, so drop
linux/debugfs.h from the list of included headers in it.

Signed-off-by: Rafael J. Wysocki
Acked-by: Viresh Kumar

Rafael J. Wysocki
2018-04-10 14:38:02 +0800

08 Feb, 2018

1 commit

70f6bf2a3 cpufreq: intel_pstate: Enable HWP during system resume on CPU0 ... Browse Code »

When maxcpus=1 is in the kernel command line, the BP is responsible
for re-enabling the HWP - because currently only the APs invoke
intel_pstate_hwp_enable() during their online process - which might
put the system into unstable state after resume.

Fix this by enabling the HWP explicitly on BP during resume.

Reported-by: Doug Smythies
Suggested-by: Srinivas Pandruvada
Signed-off-by: Yu Chen
[ rjw: Subject/changelog, minor modifications ]
Signed-off-by: Rafael J. Wysocki

Chen Yu
2018-02-08 17:21:38 +0800

12 Jan, 2018

2 commits

d8de7a44e cpufreq: intel_pstate: Add Skylake servers support ... Browse Code »

Currently intel_pstate can function only in HWP mode on Skylake servers.
When HWP feature is not enabled on the processor then acpi-cpufreq is
driver is used.

Based on the power and performance tests using intel_pstate scaling
algorithm the results are comparable. But intel_pstate brings in
additional features:
- Display of turbo frequency range, which many users like to see
- Place limits in the turbo frequency range when platform allows

Since these tests are done only using non PID algorithm introduced in
kernel version 4.14, this patch is not a backport candidate. So each user
has to carefully weigh the benefits before he backports.

Signed-off-by: Srinivas Pandruvada
Signed-off-by: Rafael J. Wysocki

Srinivas Pandruvada
2018-01-12 01:57:20 +0800
dbd49b85e cpufreq: intel_pstate: Replace bxt_funcs with core_funcs ... Browse Code »

Since core_funcs and bxt_funcs have same set of callbacks, replace
bxt_funcs with core_funcs.

Signed-off-by: Srinivas Pandruvada
Signed-off-by: Rafael J. Wysocki

Srinivas Pandruvada
2018-01-12 01:57:20 +0800

06 Sep, 2017

1 commit

53ac64aac Merge tag 'acpi-4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm ... Browse Code »

Pull ACPI updates from Rafael Wysocki:
"These include a usual ACPICA code update (this time to upstream
revision 20170728), a fix for a boot crash on some systems with
Thunderbolt devices connected at boot time, a rework of the handling
of PCI bridges when setting up device wakeup, new support for Apple
device properties, support for DMA configurations reported via ACPI on
ARM64, APEI-related updates, ACPI EC driver updates and assorted minor
modifications in several places.

Specifics:

- Update the ACPICA code in the kernel to upstream revision 20170728
including:
* Alias operator handling update (Bob Moore).
* Deferred resolution of reference package elements (Bob Moore).
* Support for the _DMA method in walk resources (Bob Moore).
* Tables handling update and support for deferred table
verification (Lv Zheng).
* Update of SMMU models for IORT (Robin Murphy).
* Compiler and disassembler updates (Alex James, Erik Schmauss,
Ganapatrao Kulkarni, James Morse).
* Tools updates (Erik Schmauss, Lv Zheng).
* Assorted minor fixes and cleanups (Bob Moore, Kees Cook, Lv
Zheng, Shao Ming).

- Rework the initialization of non-wakeup GPEs with method handlers
in order to address a boot crash on some systems with Thunderbolt
devices connected at boot time where we miss an early hotplug event
due to a delay in GPE enabling (Rafael Wysocki).

- Rework the handling of PCI bridges when setting up ACPI-based
device wakeup in order to avoid disabling wakeup for bridges
prematurely (Rafael Wysocki).

- Consolidate Apple DMI checks throughout the tree, add support for
Apple device properties to the device properties framework and use
these properties for the handling of I2C and SPI devices on Apple
systems (Lukas Wunner).

- Add support for _DMA to the ACPI-based device properties lookup
code and make it possible to use the information from there to
configure DMA regions on ARM64 systems (Lorenzo Pieralisi).

- Fix several issues in the APEI code, add support for exporting the
BERT error region over sysfs and update APEI MAINTAINERS entry with
reviewers information (Borislav Petkov, Dongjiu Geng, Loc Ho, Punit
Agrawal, Tony Luck, Yazen Ghannam).

- Fix a potential initialization ordering issue in the ACPI EC driver
and clean it up somewhat (Lv Zheng).

- Update the ACPI SPCR driver to extend the existing XGENE 8250
workaround in it to a new platform (m400) and to work around an
Xgene UART clock issue (Graeme Gregory).

- Add a new utility function to the ACPI core to support using ACPI
OEM ID / OEM Table ID / Revision for system identification in
blacklisting or similar and switch over the existing code already
using this information to this new interface (Toshi Kani).

- Fix an xpower PMIC issue related to GPADC reads that always return
0 without extra pin manipulations (Hans de Goede).

- Add statements to print debug messages in a couple of places in the
ACPI core for easier diagnostics (Rafael Wysocki).

- Clean up the ACPI processor driver slightly (Colin Ian King, Hanjun
Guo).

- Clean up the ACPI x86 boot code somewhat (Andy Shevchenko).

- Add a quirk for Dell OptiPlex 9020M to the ACPI backlight driver
(Alex Hung).

- Assorted fixes, cleanups and updates related to ACPI (Amitoj Kaur
Chawla, Bhumika Goyal, Frank Rowand, Jean Delvare, Punit Agrawal,
Ronald Tschalär, Sumeet Pawnikar)"

* tag 'acpi-4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (75 commits)
ACPI / APEI: Suppress message if HEST not present
intel_pstate: convert to use acpi_match_platform_list()
ACPI / blacklist: add acpi_match_platform_list()
ACPI, APEI, EINJ: Subtract any matching Register Region from Trigger resources
ACPI: make device_attribute const
ACPI / sysfs: Extend ACPI sysfs to provide access to boot error region
ACPI: APEI: fix the wrong iteration of generic error status block
ACPI / processor: make function acpi_processor_check_duplicates() static
ACPI / EC: Clean up EC GPE mask flag
ACPI: EC: Fix possible issues related to EC initialization order
ACPI / PM: Add debug statements to acpi_pm_notify_handler()
ACPI: Add debug statements to acpi_global_event_handler()
ACPI / scan: Enable GPEs before scanning the namespace
ACPICA: Make it possible to enable runtime GPEs earlier
ACPICA: Dispatch active GPEs at init time
ACPI: SPCR: work around clock issue on xgene UART
ACPI: SPCR: extend XGENE 8250 workaround to m400
ACPI / LPSS: Don't abort ACPI scan on missing mem resource
mailbox: pcc: Drop uninformative output during boot
ACPI/IORT: Add IORT named component memory address limits
...

Linus Torvalds
2017-09-06 03:45:03 +0800

04 Sep, 2017

3 commits

ab271bc95 Merge branch 'intel_pstate' ... Browse Code »

* intel_pstate:
cpufreq: intel_pstate: Shorten a couple of long names
cpufreq: intel_pstate: Simplify intel_pstate_adjust_pstate()
cpufreq: intel_pstate: Improve IO performance with per-core P-states
cpufreq: intel_pstate: Drop INTEL_PSTATE_HWP_SAMPLING_INTERVAL
cpufreq: intel_pstate: Drop ->update_util from pstate_funcs
cpufreq: intel_pstate: Do not use PID-based P-state selection

Rafael J. Wysocki
2017-09-04 06:05:42 +0800
08a10002b Merge branch 'pm-cpufreq-sched' ... Browse Code »

* pm-cpufreq-sched:
cpufreq: schedutil: Always process remote callback with slow switching
cpufreq: schedutil: Don't restrict kthread to related_cpus unnecessarily
cpufreq: Return 0 from ->fast_switch() on errors
cpufreq: Simplify cpufreq_can_do_remote_dvfs()
cpufreq: Process remote callbacks from any CPU if the platform permits
sched: cpufreq: Allow remote cpufreq callbacks
cpufreq: schedutil: Use unsigned int for iowait boost
cpufreq: schedutil: Make iowait boost more energy efficient

Rafael J. Wysocki
2017-09-04 06:05:22 +0800
bd87c8fb9 Merge branch 'pm-cpufreq' ... Browse Code »

* pm-cpufreq: (33 commits)
cpufreq: imx6q: Fix imx6sx low frequency support
cpufreq: speedstep-lib: make several arrays static, makes code smaller
cpufreq: ti: Fix 'of_node_put' being called twice in error handling path
cpufreq: dt-platdev: Drop few entries from whitelist
cpufreq: dt-platdev: Automatically create cpufreq device with OPP v2
ARM: ux500: don't select CPUFREQ_DT
cpufreq: Convert to using %pOF instead of full_name
cpufreq: Cap the default transition delay value to 10 ms
cpufreq: dbx500: Delete obsolete driver
mfd: db8500-prcmu: Get rid of cpufreq dependency
cpufreq: enable the DT cpufreq driver on the Ux500
cpufreq: Loongson2: constify platform_device_id
cpufreq: dt: Add r8a7796 support to to use generic cpufreq driver
cpufreq: remove setting of policy->cpu in policy->cpus during init
cpufreq: mediatek: add support of cpufreq to MT7622 SoC
cpufreq: mediatek: add cleanups with the more generic naming
cpufreq: rcar: Add support for R8A7795 SoC
cpufreq: dt: Add rk3328 compatible to use generic cpufreq driver
cpufreq: s5pv210: add missing of_node_put()
cpufreq: Allow dynamic switching with CPUFREQ_ETERNAL latency
...

Rafael J. Wysocki
2017-09-04 06:05:13 +0800

29 Aug, 2017

1 commit

5e9323219 intel_pstate: convert to use acpi_match_platform_list() ... Browse Code »

Convert to use acpi_match_platform_list() for the platform check.
There is no change in functionality.

Signed-off-by: Toshi Kani
Acked-by: Srinivas Pandruvada
Reviewed-by: Borislav Petkov
Signed-off-by: Rafael J. Wysocki

Toshi Kani
2017-08-29 07:42:49 +0800

21 Aug, 2017

1 commit

57ccaf338 Merge back intel_pstate material for v4.14. Browse Code »

Rafael J. Wysocki
2017-08-21 07:50:20 +0800

18 Aug, 2017

1 commit

b20a3f3d8 cpufreq: remove setting of policy->cpu in policy->cpus during init ... Browse Code »

policy->cpu is copied into policy->cpus in cpufreq_online() before
calling into cpufreq_driver->init(). So there's no need to set the
same in the individual driver init() functions again.

This patch removes the redundant setting of policy->cpu in policy->cpus
in intel_pstate and cppc drivers.

Reported-by: Viresh Kumar
Signed-off-by: Sudeep Holla
Acked-by: Viresh Kumar
Signed-off-by: Rafael J. Wysocki

Sudeep Holla
2017-08-18 07:41:37 +0800

11 Aug, 2017

1 commit

c587c79f9 cpufreq: intel_pstate: report correct CPU frequencies during trace ... Browse Code »

The intel_pstate CPU frequency scaling driver has always
calculated CPU frequency incorrectly. Recent changes have
eliminted most of the issues, however the frequency reported
in the trace buffer, if used, is incorrect.

It remains desireable that cpu->pstate.scaling still be a nice
round number for things such as when setting max and min frequencies.
So the proposal is to just fix the reported frequency in the trace data.

Fixes what remains of [1].

Link: https://bugzilla.kernel.org/show_bug.cgi?id=96521 # [1]
Signed-off-by: Doug Smythies
Signed-off-by: Rafael J. Wysocki

Doug Smythies
2017-08-11 07:25:53 +0800