Eric Lee / smarc-fsl-linux-kernel

29 Oct, 2018

25 commits

99e929e75 MLK-18428-01 driver: thermal: add tmu driver on imx8mm

add thermal driver on i.MX8MM

Signed-off-by: Bai Ping
Reviewed-by: Anson Huang

Bai Ping
2018-10-29 11:10:38 +0800
c415cddd6 MLK-18687-2 thermal: imx_sc: add status check for thermal zone

Add status check for thermal zones, ignore those thermal
zones with status set to "disabled".

Signed-off-by: Anson Huang
Acked-by: Leonard Crestez

Anson Huang
2018-10-29 11:10:38 +0800
5416af198 MLK-18648 thermal: improve imx sc thermal driver name

Improve i.MX system controller thermal driver name
by making it lower case.

Signed-off-by: Anson Huang
Reviewed-by: Bai Ping

Anson Huang
2018-10-29 11:10:38 +0800
4a3d636a2 MLK-18569 thermal: imx_sc: Fix interpreting tenths as millicelsius

Linux expects millicelsius but tenths are handled incorrectly.

Fixes: 10a2548b8b60 ("MLK-14972-02 driver: thermal: Add i.MX8QM/QXP thermal support")

Signed-off-by: Leonard Crestez
Acked-by: Anson Huang

Leonard Crestez
2018-10-29 11:10:38 +0800
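The unit fix in MLK-18569 above can be illustrated with a minimal sketch. It assumes, hypothetically, that a sensor reading arrives as whole degrees Celsius plus a tenths digit; the function name `sc_temp_to_millicelsius` and that split are illustrative assumptions, not the driver's actual code.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not from the driver): combine whole degrees
 * Celsius and a tenths digit into millicelsius, the unit the Linux
 * thermal core expects. One tenth of a degree is 100 millicelsius,
 * so adding the tenths value unscaled would be wrong.
 */
int sc_temp_to_millicelsius(int16_t celsius, int8_t tenths)
{
	return (int)celsius * 1000 + (int)tenths * 100;
}
```

For example, a reading of 45 degrees and 3 tenths becomes 45300 under this convention.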
2725fbc3a MLK-17698-5 thermal: imx_sc: add PMIC thermal sensor for i.MX8QM

Remove unused thermal sensors and add PMIC thermal sensors
for i.MX8QM.

Signed-off-by: Anson Huang
Reviewed-by: Bai Ping

Anson Huang
2018-10-29 11:10:38 +0800
1d5db2ff9 MLK-17698-3 thermal: imx_sc: add PMIC thermal sensors for i.MX8QXP

Add PMIC thermal sensors for i.MX8QXP.

Signed-off-by: Anson Huang
Reviewed-by: Bai Ping

Anson Huang
2018-10-29 11:10:38 +0800
2f8723943 MLK-17698-1 thermal: imx_sc: use system controller thermal sensor for A35 CPU

SCFW (0d43db9 SCF-22: Move SCU controls to SYSTEM.
Allows AP to use SCU temp sensor.) now exposes the SCU's temp
sensor to the AP. Because this sensor is placed closer to the
i.MX8QXP A35 core, use it as the A35 CPU thermal sensor and
move the DRC temp sensor to a new thermal zone.

Signed-off-by: Anson Huang
Reviewed-by: Bai Ping

Anson Huang
2018-10-29 11:10:38 +0800
4f85b4294 MLK-16526-2 thermal: qoriq: add buffer for passive cooling mechanism

On i.MX8MQ, when the temperature exceeds the passive point,
the cooling mechanism is triggered and the temperature
begins to drop. To avoid bouncing back and forth around
the passive point, add a 10 C buffer: once the cooling
mechanism is triggered, it exits only after the temperature
drops to 10 C below the passive point.

Signed-off-by: Anson Huang
Reviewed-by: Bai Ping

Anson Huang
2018-10-29 11:10:38 +0800
1c89dc608 MLK-16526-1 thermal: imx_sc: add buffer for passive cooling mechanism

On i.MX8QM/8QXP, when the temperature exceeds the passive point,
the cooling mechanism is triggered and the temperature
begins to drop. To avoid bouncing back and forth around
the passive point, add a 10 C buffer: once the cooling
mechanism is triggered, it exits only after the temperature
drops to 10 C below the passive point.

Signed-off-by: Anson Huang
Reviewed-by: Bai Ping

Anson Huang
2018-10-29 11:10:38 +0800
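The 10 C buffer described in MLK-16526-1 and MLK-16526-2 above is a hysteresis band. The sketch below shows one way to express it; the function name, the macro, and the stateful `cooling_active` flag are assumptions made for illustration and are not taken from the qoriq or imx_sc drivers.

```c
#include <stdbool.h>

/* Assumed buffer below the passive trip point: 10 C in millicelsius. */
#define PASSIVE_COOL_DELTA_MC 10000

/*
 * Hypothetical helper: decide whether passive cooling should be active
 * after the next temperature sample.
 * - Cooling starts once temp reaches the passive trip point.
 * - Once started, it stays on until temp has dropped to
 *   (trip - PASSIVE_COOL_DELTA_MC).
 */
bool passive_cooling_next(bool cooling_active, int temp_mc, int passive_trip_mc)
{
	if (temp_mc >= passive_trip_mc)
		return true;
	if (cooling_active && temp_mc > passive_trip_mc - PASSIVE_COOL_DELTA_MC)
		return true;
	return false;
}
```

With an 85000 mC trip, a sample of 80000 mC keeps cooling on if it was already active but does not start it, which is what prevents toggling around the trip point.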
65e673b65 MLK-16470 thermal: imx_thermal: fix wrong thermal grade register read for MX7D

From MX7D Fuse Map v2.9, the thermal grade register is 0x440[7:6],
not 0x480[7:6] as before.

Fixes: 2045abb4391a ("MLK-11518-01 thermal: imx: add thermal support for imx7")
Reviewed-by: Bai Ping
Signed-off-by: Dong Aisheng

Dong Aisheng
2018-10-29 11:10:38 +0800
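The `[7:6]` notation in MLK-16470 above denotes a two-bit field. As a minimal sketch, assuming the fuse word has already been read from the corrected offset into a 32-bit value, the field can be extracted as follows; `thermal_grade_from_fuse` is a hypothetical name, and the read itself is not shown.

```c
#include <stdint.h>

/*
 * Hypothetical helper: extract bits [7:6] of a 32-bit fuse word,
 * yielding a 2-bit thermal grade value in the range 0..3.
 */
unsigned int thermal_grade_from_fuse(uint32_t fuse_word)
{
	return (fuse_word >> 6) & 0x3u;
}
```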
775ccd86b MLK-16415 thermal: imx_sc: add device cooling for all thermal zones

For system controller thermal devices, add device
cooling for all thermal zones. When the temperature
exceeds the passive trip point, the thermal driver
sends out a notification, and all devices registered
for the device cooling notification can take action to
cool down the chip. For the GPU, for example, the
message below is printed:

[ 581.284453] System is too hot. GPU3D will work at 1/64 clock.

And when the temperature drops below the passive trip
point, the GPU cooling action is cancelled:

[ 578.300532] Hot alarm is canceled. GPU3D clock will return to 64/64

Signed-off-by: Anson Huang

Anson Huang
2018-10-29 11:10:38 +0800
411cb09e6 MLK-16372-1 thermal: imx_sc: add get_trend and set_trip_temp support

Add get_trend and set_trip_temp callbacks to support
the cpu-freq cooling function.

Signed-off-by: Anson Huang

Anson Huang
2018-10-29 11:10:38 +0800
ff29eef53 MLK-16300 thermal: imx: avoid error message of get_temp when thermal zone is off

For i.MX system controller thermal, when some of the thermal
zones are powered off, reading their temp fails and the thermal
driver returns the CPU thermal zone's temp instead. But the current
driver returns the A53 cluster's temp in all cases, and the A53 cluster
may also be off when only the A72 cluster is booted, so the error
messages below appear:

[ 475.606431] read temp sensor:0 failed
[ 475.610107] thermal thermal_zone0: failed to read out thermal zone (-22)

To avoid this error when a thermal zone is powered off, the
thermal driver can return the temperature of the current thread's CPU cluster.

Signed-off-by: Anson Huang
Reviewed-by: Bai Ping

Anson Huang
2018-10-29 11:10:38 +0800
031d55506 MLK-16109-1 thermal: qoriq: add device cooling support

On i.MX8MQ, once the temperature exceeds the hot threshold, some
modules such as the GPU can reduce their frequency to cool down
the chip. Any module can register for this device cooling
notifier to receive the thermal HOT notification.

Signed-off-by: Anson Huang

Anson Huang
2018-10-29 11:10:38 +0800
ad1f676e0 MLK-16093-2 thermal: qoriq: add necessary callbacks for cooling support

Add get_trend and set_trip_temp to support the i.MX8MQ cooling
device. get_trend customizes the cooling governor's behavior:
once the temperature exceeds the passive trip, the cooling device
works at full function. set_trip_temp updates the trip temp when
running thermal tests that modify the trip temp from sysfs.

Signed-off-by: Anson Huang

Anson Huang
2018-10-29 11:10:38 +0800
bb6c8b9aa MLK-15953-02 driver: thermal: Add tmu thermal driver support for i.mx8mq

On i.MX8MQ, we use the same TMU as on the QorIQ platform, so the TMU driver
for the QorIQ platform can be reused on our i.MX8M platform.

Signed-off-by: Bai Ping

Bai Ping
2018-10-29 11:10:38 +0800
c1196da0a MLK-15075 thermal: imx: fix temp read failure on i.mx7d

On i.MX7D, if the system enters LPSR mode, the tempmon module
is powered down and its register values are lost, so we need
to save the registers before suspend and restore them after
resume.

Signed-off-by: Bai Ping

Bai Ping
2018-10-29 11:10:38 +0800
b1f6fd0b6 MLK-14972-02 driver: thermal: Add i.MX8QM/QXP thermal support

Add i.MX8QM/QXP thermal driver support.

Signed-off-by: Bai Ping

Bai Ping
2018-10-29 11:10:38 +0800
5ccc424b8 MLK-12072 thermal: imx: enable tempmon finish bit check on imx7d TO1.1

On i.MX7D TO1.0, the finish bit in the tempmon module, used to verify
the temp value, is broken, so it can NOT be used for checking the temp
value. On TO1.1, this issue has been fixed, so we can use this bit
to verify that the temp value is valid.

Signed-off-by: Bai Ping

Bai Ping
2018-10-29 11:10:38 +0800
e5d7370e2 MLK-11705 thermal: imx: make the critical trip temp changeable for test

In order to test the critical trip point function, the
critical trip point temp should be writable from userspace.

Signed-off-by: Bai Ping

Bai Ping
2018-10-29 11:10:38 +0800
64e708f5a MLK-11600 thermal: imx: notify thermal driver in low_bus_freq_mode

The thermal sensor alarm function needs PLL3 to be always on, but low power
idle needs all PLLs to be off, so the two are mutually exclusive. Low power idle
is only enabled while the system stays in low bus mode, which means overall system
power consumption is NOT high, so the thermal alarm function can be disabled in this
mode to allow low power idle to be entered. The thermal sensor still uses the polling
mechanism to monitor the system temperature. Add a busfreq notifier to achieve this goal.
(this patch is copied from commit dd3d1e6c6ff0)

Also unregister the busfreq_notifier when the thermal driver is removed.

Signed-off-by: Bai Ping

Bai Ping
2018-10-29 11:10:38 +0800
3b8304e68 MLK-11518-03 thermal: imx enable devfreq cooling

Enable devfreq cooling to trigger a GPU freq change when
the hot trip is reached.

Make sure the thermal driver is loaded after cpufreq;
otherwise cpu_cooling will not get a valid cpufreq table
and will not work.

Signed-off-by: Bai Ping

Bai Ping
2018-10-29 11:10:38 +0800
74c40aa77 MLK-11518-02 thermal: imx: add .get_trend callback fn in thermal driver

Add the .get_trend callback to determine the thermal raise/fall trend.
When the temp is greater than a threshold, drop to the lowest trend
(THERMAL_TREND_DROP_FULL).

Signed-off-by: Bai Ping

Bai Ping
2018-10-29 11:10:38 +0800
b9ca3eb59 MLK-11518-01 thermal: imx: add thermal support for imx7

This patch rewrites part of the code to support i.MX6 and i.MX7
in the thermal driver. The TEMPMON module in i.MX6 and i.MX7 provides
the same function, but has different register offsets and bitfield definitions.

Signed-off-by: Bai Ping

Bai Ping
2018-10-29 11:10:38 +0800
843d10018 MLK-11485 thermal: add device cooling for thermal driver

This patch is cherry-picked from imx_3.14.y
(cherry picked from commit 51e376b469c)
ENGR00274056-1 thermal: add device cooling for thermal driver

CPU cooling alone is not enough when the temperature is
too high, because some devices, such as the GPU, may contribute
a lot of heat to the SoC. Add device cooling as well, so that
when the system is too hot, devices can also take
action to lower the SoC temperature.

When the temperature crosses the passive trip, the device cooling
driver sends out a notification, and the devices registered
for this devfreq_cooling notification take
action to lower the SoC temperature.

Signed-off-by: Anson Huang
Signed-off-by: Shawn Guo
Signed-off-by: Bai Ping

Anson Huang
2018-10-29 11:10:38 +0800

04 Oct, 2018

1 commit

083be6fbf thermal: of-thermal: disable passive polling when thermal zone is disabled

[ Upstream commit 152395fd03d4ce1e535a75cdbf58105e50587611 ]

When a thermal zone is in passive mode, disabling it from
sysfs does NOT take effect at all: the core keeps polling the
temperature of the disabled thermal zone and handling all thermal
trips, which confuses users. Disabling should turn off the
thermal zone's behavior completely, in both active and
passive mode. This patch clears passive_delay when the thermal
zone is disabled and restores it when the zone is enabled.

Signed-off-by: Anson Huang
Signed-off-by: Eduardo Valentin
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman

Anson Huang
2018-10-04 08:00:57 +0800
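The clear-and-restore behavior described in the of-thermal commit above can be sketched in isolation. The `struct zone` type, its field names, and both functions below are hypothetical simplifications for illustration; they are not the kernel's `thermal_zone_device` or the of-thermal code.

```c
#include <stdbool.h>

/* Hypothetical, simplified zone state; not the kernel's structures. */
struct zone {
	bool enabled;
	int passive_delay_ms;       /* delay used for polling; 0 means no passive polling */
	int saved_passive_delay_ms; /* configured delay, kept while the zone is disabled */
};

/* Set up an enabled zone with its configured passive polling delay. */
void zone_init(struct zone *z, int passive_delay_ms)
{
	z->enabled = true;
	z->passive_delay_ms = passive_delay_ms;
	z->saved_passive_delay_ms = passive_delay_ms;
}

/*
 * Disabling clears the active delay so passive polling stops;
 * enabling restores the configured value.
 */
void zone_set_enabled(struct zone *z, bool enable)
{
	if (enable)
		z->passive_delay_ms = z->saved_passive_delay_ms;
	else
		z->passive_delay_ms = 0;
	z->enabled = enable;
}
```

Keeping the configured delay in a separate field means that disabling a zone twice in a row cannot lose the original value.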

03 Aug, 2018

1 commit

b62ed0bbb thermal: exynos: fix setting rising_threshold for Exynos5433

[ Upstream commit 8bfc218d0ebbabcba8ed2b8ec1831e0cf1f71629 ]

Add missing clearing of the previous value when setting rising
temperature threshold.

Signed-off-by: Bartlomiej Zolnierkiewicz
Signed-off-by: Eduardo Valentin
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman

Bartlomiej Zolnierkiewicz
2018-08-03 13:50:37 +0800
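The "missing clearing of the previous value" in the Exynos5433 commit above is a read-modify-write pattern. The sketch below assumes, for illustration only, an 8-bit threshold field located at a caller-supplied bit offset; the function name and field layout are hypothetical and not taken from the Exynos TMU register map.

```c
#include <stdint.h>

/*
 * Hypothetical helper: place an 8-bit threshold code into a 32-bit
 * register value at bit offset `shift` (0, 8, 16 or 24).
 * The old field contents are cleared first; OR-ing alone would merge
 * the new code with whatever bits were already set.
 */
uint32_t set_threshold_field(uint32_t reg, unsigned int shift, uint8_t code)
{
	reg &= ~((uint32_t)0xff << shift); /* clear the previous value */
	reg |= (uint32_t)code << shift;    /* write the new value */
	return reg;
}
```

Without the clearing step, writing 0x0F over an existing 0xF0 would leave 0xFF in the field instead of 0x0F.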

03 Jul, 2018

1 commit

0cf93821e thermal: bcm2835: Stop using printk format %pCr

commit bd2a07f71a1e2e198f8a30cb551d9defe422d83d upstream.

Printk format "%pCr" will be removed soon, as clk_get_rate() must not be
called in atomic context.

Replace it by printing the variable that already holds the clock rate.
Note that calling clk_get_rate() is safe here, as the code runs in task
context.

Link: http://lkml.kernel.org/r/1527845302-12159-3-git-send-email-geert+renesas@glider.be
To: Jia-Ju Bai
To: Jonathan Corbet
To: Michael Turquette
To: Stephen Boyd
To: Zhang Rui
To: Eduardo Valentin
To: Eric Anholt
To: Stefan Wahren
To: Greg Kroah-Hartman
Cc: Sergey Senozhatsky
Cc: Petr Mladek
Cc: Linus Torvalds
Cc: Steven Rostedt
Cc: linux-doc@vger.kernel.org
Cc: linux-clk@vger.kernel.org
Cc: linux-pm@vger.kernel.org
Cc: linux-serial@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-renesas-soc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: stable@vger.kernel.org # 4.12+
Signed-off-by: Geert Uytterhoeven
Acked-by: Stefan Wahren
Signed-off-by: Petr Mladek
Signed-off-by: Greg Kroah-Hartman

Geert Uytterhoeven
2018-07-03 17:24:48 +0800

21 Jun, 2018

1 commit

b1d0907c6 thermal: int3403_thermal: Fix NULL pointer deref on module load / probe

[ Upstream commit 13b86f50eaaddaea4bdd2fe476fd12e6a0951add ]

Starting with kernel 4.17 thermal_cooling_device_register() will call the
get_max_state() op during register.

Since we deref priv->priv in int3403_get_max_state() this means we must
set priv->priv before calling thermal_cooling_device_register().

Signed-off-by: Hans de Goede
Signed-off-by: Zhang Rui
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman

Hans de Goede
2018-06-21 03:02:47 +0800

16 May, 2018

2 commits

db433f83a thermal: exynos: Propagate error value from tmu_read()

commit c8da6cdef57b459ac0fd5d9d348f8460a575ae90 upstream.

tmu_read() on Exynos4210 may return an error for out-of-bound
values. The current code ignores such errors, which leads to reporting
a critical temperature value. Add proper error code propagation to the
exynos_get_temp() function.

Signed-off-by: Marek Szyprowski
CC: stable@vger.kernel.org # v4.6+
Signed-off-by: Bartlomiej Zolnierkiewicz
Signed-off-by: Eduardo Valentin
Signed-off-by: Greg Kroah-Hartman

Marek Szyprowski
2018-05-16 16:10:30 +0800
33df2f8a8 thermal: exynos: Reading temperature makes sense only when TMU is turned on

commit 88fc6f73fddf64eb507b04f7b2bd01d7291db514 upstream.

When thermal sensor is not yet enabled, reading temperature might return
random value. This might even result in stopping system booting when such
temperature is higher than the critical value. Fix this by checking if TMU
has been actually enabled before reading the temperature.

This change fixes booting of Exynos4210-based board with TMU enabled (for
example Samsung Trats board), which was broken since v4.4 kernel release.

Signed-off-by: Marek Szyprowski
Fixes: 9e4249b40340 ("thermal: exynos: Fix first temperature read after registering sensor")
CC: stable@vger.kernel.org # v4.6+
Signed-off-by: Bartlomiej Zolnierkiewicz
Signed-off-by: Eduardo Valentin
Signed-off-by: Greg Kroah-Hartman

Marek Szyprowski
2018-05-16 16:10:30 +0800

24 Apr, 2018

1 commit

ecb67e92d thermal: imx: Fix race condition in imx_thermal_probe()

commit cf1ba1d73a33944d8c1a75370a35434bf146b8a7 upstream.

When the device boots with T > T_trip_1 and requests the interrupt,
a race condition takes place. The interrupt arrives before
THERMAL_DEVICE_ENABLED is set. This leads to an attempt to
read the sensor value from the irq and disable the sensor, based on
the data->mode field, which is expected to be THERMAL_DEVICE_ENABLED
but still stays as THERMAL_DEVICE_DISABLED. After this issue the
sensor is never re-enabled, as the driver state is wrong.

Fix this problem by setting the 'data' members prior to
requesting the interrupts.

Fixes: 37713a1e8e4c ("thermal: imx: implement thermal alarm interrupt handling")
Cc:
Signed-off-by: Mikhail Lappo
Signed-off-by: Fabio Estevam
Reviewed-by: Philipp Zabel
Acked-by: Dong Aisheng
Signed-off-by: Zhang Rui
Signed-off-by: Greg Kroah-Hartman

Mikhail Lappo
2018-04-24 15:36:34 +0800

12 Apr, 2018

2 commits

5dff63583 thermal: int3400_thermal: fix error handling in int3400_thermal_probe()

[ Upstream commit 0be86969ae385c5c944286bd9f66068525de15ee ]

There are resources that are not deallocated on the failure path
in int3400_thermal_probe().

Found by Linux Driver Verification project (linuxtesting.org).

Signed-off-by: Alexey Khoroshilov
Signed-off-by: Zhang Rui
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman

Alexey Khoroshilov
2018-04-12 18:32:21 +0800
ea40afb5c thermal: power_allocator: fix one race condition issue for thermal_instances list

[ Upstream commit a5de11d67dcd268b8d0beb73dc374de5e97f0caf ]

When invoking allow_maximum_power and traversing tz->thermal_instances,
we should grab thermal_zone_device->lock to avoid a race condition. For
example, during system reboot, if the mali GPU device implements a
device shutdown callback and unregisters the GPU devfreq cooling device,
the deleted list head may be accessed and cause a panic, as the following
log shows:

[ 33.551070] c3 25 (kworker/3:0) Unable to handle kernel paging request at virtual address dead000000000070
[ 33.566708] c3 25 (kworker/3:0) pgd = ffffffc0ed290000
[ 33.572071] c3 25 (kworker/3:0) [dead000000000070] *pgd=00000001ed292003, *pud=00000001ed292003, *pmd=0000000000000000
[ 33.581515] c3 25 (kworker/3:0) Internal error: Oops: 96000004 [#1] PREEMPT SMP
[ 33.599761] c3 25 (kworker/3:0) CPU: 3 PID: 25 Comm: kworker/3:0 Not tainted 4.4.35+ #912
[ 33.614137] c3 25 (kworker/3:0) Workqueue: events_freezable thermal_zone_device_check
[ 33.620245] c3 25 (kworker/3:0) task: ffffffc0f32e4200 ti: ffffffc0f32f0000 task.ti: ffffffc0f32f0000
[ 33.629466] c3 25 (kworker/3:0) PC is at power_allocator_throttle+0x7c8/0x8a4
[ 33.636609] c3 25 (kworker/3:0) LR is at power_allocator_throttle+0x808/0x8a4
[ 33.643742] c3 25 (kworker/3:0) pc : [] lr : [] pstate: 20000145
[ 33.652874] c3 25 (kworker/3:0) sp : ffffffc0f32f3bb0
[ 34.468519] c3 25 (kworker/3:0) Process kworker/3:0 (pid: 25, stack limit = 0xffffffc0f32f0020)
[ 34.477220] c3 25 (kworker/3:0) Stack: (0xffffffc0f32f3bb0 to 0xffffffc0f32f4000)
[ 34.819822] c3 25 (kworker/3:0) Call trace:
[ 34.824021] c3 25 (kworker/3:0) Exception stack(0xffffffc0f32f39c0 to 0xffffffc0f32f3af0)
[ 34.924993] c3 25 (kworker/3:0) [] power_allocator_throttle+0x7c8/0x8a4
[ 34.933184] c3 25 (kworker/3:0) [] handle_thermal_trip.part.25+0x70/0x224
[ 34.941545] c3 25 (kworker/3:0) [] thermal_zone_device_update+0xc0/0x20c
[ 34.949818] c3 25 (kworker/3:0) [] thermal_zone_device_check+0x20/0x2c
[ 34.957924] c3 25 (kworker/3:0) [] process_one_work+0x168/0x458
[ 34.965414] c3 25 (kworker/3:0) [] worker_thread+0x13c/0x4b4
[ 34.972650] c3 25 (kworker/3:0) [] kthread+0xe8/0xfc
[ 34.979187] c3 25 (kworker/3:0) [] ret_from_fork+0x10/0x40
[ 34.986244] c3 25 (kworker/3:0) Code: f9405e73 eb1302bf d102e273 54ffc460 (b9402a61)
[ 34.994339] c3 25 (kworker/3:0) ---[ end trace 32057901e3b7e1db ]---

Signed-off-by: Yi Zeng
Signed-off-by: Zhang Rui
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman

Yi Zeng
2018-04-12 18:32:12 +0800

25 Dec, 2017

4 commits

5431aef93 thermal/drivers/hisi: Fix multiple alarm interrupts firing

commit db2b0332608c8e648ea1e44727d36ad37cdb56cb upstream.

The DT specifies a threshold of 65000, we setup the register with a value in
the temperature resolution for the controller, 64656.

When we reach 64656, the interrupt fires, the interrupt is disabled. Then the
irq thread runs and calls thermal_zone_device_update() which will call in turn
hisi_thermal_get_temp().

The function will look if the temperature decreased, assuming it was more than
65000, but that is not the case because the current temperature is 64656
(because of the rounding when setting the threshold). This condition being
true, we re-enable the interrupt which fires immediately after exiting the irq
thread. That happens again and again until the temperature goes to more than
65000.

Potentially, there is here an interrupt storm if the temperature stabilizes at
this temperature. A very unlikely case but possible.

In any case, it does not make sense to handle dozens of alarm interrupts for
nothing.

Fix this by rounding the threshold value to the controller resolution so the
check against the threshold is consistent with the one set in the controller.

Signed-off-by: Daniel Lezcano
Reviewed-by: Leo Yan
Tested-by: Leo Yan
Signed-off-by: Eduardo Valentin
Signed-off-by: Kevin Wangtao
Signed-off-by: Greg Kroah-Hartman

Daniel Lezcano
2017-12-25 21:26:31 +0800
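The rounding described in the hisi commit above (a 65000 threshold becoming 64656 in controller resolution) can be reproduced with a small sketch. The base and step constants below are assumptions chosen because they reproduce the 65000 to 64656 figure quoted in the message; the function names are hypothetical, and inputs are assumed to be at or above the base temperature.

```c
/* Assumed controller parameters that reproduce the 65000 -> 64656 example. */
#define TEMP_BASE_MC (-60000) /* temperature at step 0, in millicelsius */
#define TEMP_STEP_MC 784      /* millicelsius per controller step */

/* Convert millicelsius to a controller step count (rounds down). */
int temp_to_step(int temp_mc)
{
	return (temp_mc - TEMP_BASE_MC) / TEMP_STEP_MC;
}

/* Convert a controller step count back to millicelsius. */
int step_to_temp(int step)
{
	return TEMP_BASE_MC + step * TEMP_STEP_MC;
}

/*
 * Round a requested threshold to a value the controller can represent,
 * so later software comparisons use the same value as the hardware.
 */
int round_to_resolution(int temp_mc)
{
	return step_to_temp(temp_to_step(temp_mc));
}
```

Comparing the current temperature against `round_to_resolution(65000)` rather than 65000 keeps the software check consistent with the threshold actually programmed into the controller.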
02c17c0f8 thermal/drivers/hisi: Simplify the temperature/step computation

commit 48880b979cdc9ef5a70af020f42b8ba1e51dbd34 upstream.

The step and the base temperature are fixed values, we can simplify the
computation by converting the base temperature to milli celsius and use a
pre-computed step value. That saves us a lot of mult + div for nothing at
runtime.

Take also the opportunity to change the function names to be consistent with
the rest of the code.

Signed-off-by: Daniel Lezcano
Reviewed-by: Leo Yan
Tested-by: Leo Yan
Signed-off-by: Eduardo Valentin
Signed-off-by: Kevin Wangtao
Signed-off-by: Greg Kroah-Hartman

Daniel Lezcano
2017-12-25 21:26:31 +0800
cf826c577 thermal/drivers/hisi: Fix kernel panic on alarm interrupt

commit 2cb4de785c40d4a2132cfc13e63828f5a28c3351 upstream.

The threaded interrupt for the alarm interrupt is requested before the
temperature controller is setup. This one can fire an interrupt immediately
leading to a kernel panic as the sensor data is not initialized.

In order to prevent that, move the threaded irq after the Tsensor is setup.

Signed-off-by: Daniel Lezcano
Reviewed-by: Leo Yan
Tested-by: Leo Yan
Signed-off-by: Eduardo Valentin
Signed-off-by: Kevin Wangtao
Signed-off-by: Greg Kroah-Hartman

Daniel Lezcano
2017-12-25 21:26:31 +0800
7254834c4 thermal/drivers/hisi: Fix missing interrupt enablement

commit c176b10b025acee4dc8f2ab1cd64eb73b5ccef53 upstream.

The interrupt for the temperature threshold is not enabled at the end of the
probe function, enable it after the setup is complete.

On the other side, the irq_enabled is not correctly set as we are checking if
the interrupt is masked where 'yes' means irq_enabled=false.

irq_get_irqchip_state(data->irq, IRQCHIP_STATE_MASKED,
&data->irq_enabled);

As we are always enabling the interrupt, it is pointless to check if
the interrupt is masked or not, just set irq_enabled to 'true'.

Signed-off-by: Daniel Lezcano
Reviewed-by: Leo Yan
Tested-by: Leo Yan
Signed-off-by: Eduardo Valentin
Signed-off-by: Kevin Wangtao
Signed-off-by: Greg Kroah-Hartman

Daniel Lezcano
2017-12-25 21:26:30 +0800

20 Dec, 2017

1 commit

5642562d0 thermal/drivers/step_wise: Fix temperature regulation misbehavior

[ Upstream commit 07209fcf33542c1ff1e29df2dbdf8f29cdaacb10 ]

There is a particular situation when the cooling device is cpufreq and the heat
dissipation is not efficient enough where the temperature increases little by
little until reaching the critical threshold and leading to a SoC reset.

The behavior is reproducible on a hikey6220 with bad heat dissipation (eg.
stacked with other boards).

Running a simple C program doing while(1); for each CPU of the SoC makes the
temperature reach the passive regulation trip point and end up at the
maximum allowed temperature followed by a reset.

This issue has been also reported by running the libhugetlbfs test suite.

What is observed is a ping pong between two cpu frequencies, 1.2GHz and 900MHz
while the temperature continues to grow.

It appears the step wise governor calls get_target_state() the first time with
the throttle set to true and the trend to 'raising'. The code selects logically
the next state, so the cpu frequency decreases from 1.2GHz to 900MHz, so far so
good. The temperature decreases immediately but still stays greater than the
trip point, then get_target_state() is called again, this time with the
throttle set to true *and* the trend to 'dropping'. From there the algorithm
assumes we have to step down the state and the cpu frequency jumps back to
1.2GHz. But the temperature is still higher than the trip point, so
get_target_state() is called with throttle=1 and trend='raising' again, we jump
to 900MHz, then get_target_state() is called with throttle=1 and
trend='dropping', we jump to 1.2GHz, etc ... but the temperature does not
stabilize and continues to increase.

[ 237.922654] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=1,throttle=1
[ 237.922678] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=1,throttle=1
[ 237.922690] thermal cooling_device0: cur_state=0
[ 237.922701] thermal cooling_device0: old_target=0, target=1
[ 238.026656] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=2,throttle=1
[ 238.026680] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=2,throttle=1
[ 238.026694] thermal cooling_device0: cur_state=1
[ 238.026707] thermal cooling_device0: old_target=1, target=0
[ 238.134647] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=1,throttle=1
[ 238.134667] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=1,throttle=1
[ 238.134679] thermal cooling_device0: cur_state=0
[ 238.134690] thermal cooling_device0: old_target=0, target=1

In this situation the temperature continues to increase while the trend is
oscillating between 'dropping' and 'raising'. We need to keep the current state
untouched if the throttle is set, so the temperature can decrease or a higher
state could be selected, thus preventing this oscillation.

Keeping the next_target untouched when 'throttle' is true at 'dropping' time
fixes the issue.

The following traces show the governor does not change the next state if
trend==2 (dropping) and throttle==1.

[ 2306.127987] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=1,throttle=1
[ 2306.128009] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=1,throttle=1
[ 2306.128021] thermal cooling_device0: cur_state=0
[ 2306.128031] thermal cooling_device0: old_target=0, target=1
[ 2306.231991] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=2,throttle=1
[ 2306.232016] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=2,throttle=1
[ 2306.232030] thermal cooling_device0: cur_state=1
[ 2306.232042] thermal cooling_device0: old_target=1, target=1
[ 2306.335982] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=0,throttle=1
[ 2306.336006] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=0,throttle=1
[ 2306.336021] thermal cooling_device0: cur_state=1
[ 2306.336034] thermal cooling_device0: old_target=1, target=1
[ 2306.439984] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=2,throttle=1
[ 2306.440008] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=2,throttle=0
[ 2306.440022] thermal cooling_device0: cur_state=1
[ 2306.440034] thermal cooling_device0: old_target=1, target=0

[ ... ]

After a while, if the temperature continues to increase, the next state becomes
2 which is 720MHz on the hikey. That results in the temperature stabilizing
around the trip point.

[ 2455.831982] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=1,throttle=1
[ 2455.832006] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=1,throttle=0
[ 2455.832019] thermal cooling_device0: cur_state=1
[ 2455.832032] thermal cooling_device0: old_target=1, target=1
[ 2455.935985] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=0,throttle=1
[ 2455.936013] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=0,throttle=0
[ 2455.936027] thermal cooling_device0: cur_state=1
[ 2455.936040] thermal cooling_device0: old_target=1, target=1
[ 2456.043984] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=0,throttle=1
[ 2456.044009] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=0,throttle=0
[ 2456.044023] thermal cooling_device0: cur_state=1
[ 2456.044036] thermal cooling_device0: old_target=1, target=1
[ 2456.148001] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=1,throttle=1
[ 2456.148028] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=1,throttle=1
[ 2456.148042] thermal cooling_device0: cur_state=1
[ 2456.148055] thermal cooling_device0: old_target=1, target=2
[ 2456.252009] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=2,throttle=1
[ 2456.252041] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=2,throttle=0
[ 2456.252058] thermal cooling_device0: cur_state=2
[ 2456.252075] thermal cooling_device0: old_target=2, target=1

IOW, this change is needed to keep the state for a cooling device if the
temperature trend is oscillating while the temperature increases slightly.

Without this change, the situation above leads to a catastrophic crash by a
hardware reset on hikey. This issue has been reported to happen on an OMAP
dra7xx also.

Signed-off-by: Daniel Lezcano
Cc: Keerthy
Cc: John Stultz
Cc: Leo Yan
Tested-by: Keerthy
Reviewed-by: Keerthy
Signed-off-by: Eduardo Valentin
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman

Daniel Lezcano
2017-12-20 17:10:28 +0800
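The step_wise fix above comes down to one rule: while the throttle flag is set, a 'dropping' trend must not lower the cooling state. The sketch below models that rule for a single trip point and a single cooling device. It is a simplified illustration with hypothetical names, not the governor's actual get_target_state(); the enum values follow the trend numbers visible in the traces (0 stable, 1 raising, 2 dropping).

```c
#include <stdbool.h>

/* Trend values numbered as in the traces above. */
enum trend { TREND_STABLE = 0, TREND_RAISING = 1, TREND_DROPPING = 2 };

/*
 * Hypothetical, simplified next-state selection for one cooling device.
 * States run from `lower` (least cooling) to `upper` (most cooling).
 */
int next_cooling_state(enum trend trend, bool throttle,
		       int cur_state, int lower, int upper)
{
	switch (trend) {
	case TREND_RAISING:
		/* Above the trip and still heating: step cooling up. */
		if (throttle && cur_state < upper)
			return cur_state + 1;
		return cur_state;
	case TREND_DROPPING:
		/* The fix: still above the trip, so keep the current state. */
		if (throttle)
			return cur_state;
		/* Below the trip and cooling: step cooling down. */
		if (cur_state > lower)
			return cur_state - 1;
		return cur_state;
	default:
		return cur_state;
	}
}
```

With this rule, the sequence raising, dropping, raising under throttle moves the state 0, 1, 1, 2 instead of oscillating between 0 and 1.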

02 Nov, 2017

1 commit

b24413180 License cleanup: add SPDX GPL-2.0 license identifier to files with no license

Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.

By default all files without license information are under the default
license of the kernel, which is GPL version 2.

Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.

This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.

How this work was done:

Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information in it,
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,

Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.

The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.

The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.

Criteria used to select files for SPDX license identifier tagging was:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if
Reviewed-by: Philippe Ombredanne
Reviewed-by: Thomas Gleixner
Signed-off-by: Greg Kroah-Hartman

Greg Kroah-Hartman
2017-11-02 18:10:55 +0800