29 Oct, 2018

25 commits


04 Oct, 2018

1 commit

  • [ Upstream commit 152395fd03d4ce1e535a75cdbf58105e50587611 ]

    When a thermal zone is in passive mode, disabling it from sysfs
    has no effect at all: the temperature of the disabled thermal zone
    is still polled and all thermal trips are still handled, which
    confuses users. Disabling a thermal zone should stop its behavior
    completely, in both active and passive mode. This patch clears
    passive_delay when the thermal zone is disabled and restores it
    when the zone is enabled.

    Signed-off-by: Anson Huang
    Signed-off-by: Eduardo Valentin
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Anson Huang
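
    A minimal sketch of the clear-and-restore idea described in this commit
    message, not the driver's actual code. The struct, the saved_passive_delay
    field and zone_set_mode() are hypothetical names, and the sketch assumes
    that a passive_delay of 0 means the zone is no longer polled.

    ```c
    #include <assert.h>
    #include <stddef.h>

    enum zone_mode { ZONE_DISABLED, ZONE_ENABLED };

    /* Hypothetical, simplified stand-in for a thermal zone. */
    struct zone {
        enum zone_mode mode;
        int passive_delay;        /* polling interval in ms; 0 assumed to stop polling */
        int saved_passive_delay;  /* hypothetical field holding the value while disabled */
    };

    static void zone_set_mode(struct zone *z, enum zone_mode mode)
    {
        assert(z != NULL);

        if (mode == z->mode)
            return;

        if (mode == ZONE_DISABLED) {
            /* Remember the interval, then clear it so passive polling stops. */
            z->saved_passive_delay = z->passive_delay;
            z->passive_delay = 0;
        } else {
            /* Restore the interval that was in effect before disabling. */
            z->passive_delay = z->saved_passive_delay;
        }
        z->mode = mode;
    }
    ```

    Disabling a zone with a 250 ms interval leaves passive_delay at 0, and
    enabling it again brings the 250 ms value back.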
     

03 Aug, 2018

1 commit


03 Jul, 2018

1 commit

  • commit bd2a07f71a1e2e198f8a30cb551d9defe422d83d upstream.

    Printk format "%pCr" will be removed soon, as clk_get_rate() must not be
    called in atomic context.

    Replace it by printing the variable that already holds the clock rate.
    Note that calling clk_get_rate() is safe here, as the code runs in task
    context.

    Link: http://lkml.kernel.org/r/1527845302-12159-3-git-send-email-geert+renesas@glider.be
    To: Jia-Ju Bai
    To: Jonathan Corbet
    To: Michael Turquette
    To: Stephen Boyd
    To: Zhang Rui
    To: Eduardo Valentin
    To: Eric Anholt
    To: Stefan Wahren
    To: Greg Kroah-Hartman
    Cc: Sergey Senozhatsky
    Cc: Petr Mladek
    Cc: Linus Torvalds
    Cc: Steven Rostedt
    Cc: linux-doc@vger.kernel.org
    Cc: linux-clk@vger.kernel.org
    Cc: linux-pm@vger.kernel.org
    Cc: linux-serial@vger.kernel.org
    Cc: linux-arm-kernel@lists.infradead.org
    Cc: linux-renesas-soc@vger.kernel.org
    Cc: linux-kernel@vger.kernel.org
    Cc: stable@vger.kernel.org # 4.12+
    Signed-off-by: Geert Uytterhoeven
    Acked-by: Stefan Wahren
    Signed-off-by: Petr Mladek
    Signed-off-by: Greg Kroah-Hartman

    Geert Uytterhoeven
     

21 Jun, 2018

1 commit

  • [ Upstream commit 13b86f50eaaddaea4bdd2fe476fd12e6a0951add ]

    Starting with kernel 4.17 thermal_cooling_device_register() will call the
    get_max_state() op during register.

    Since we deref priv->priv in int3403_get_max_state() this means we must
    set priv->priv before calling thermal_cooling_device_register().

    Signed-off-by: Hans de Goede
    Signed-off-by: Zhang Rui
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Hans de Goede
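
    The ordering problem in this commit can be illustrated with a small,
    self-contained sketch. All types and functions below are hypothetical
    stand-ins, not the kernel API: register_cooling_device() only models the
    fact that the op is called during registration, and get_max_state()
    returns -1 where the real driver would dereference a NULL pointer.

    ```c
    #include <assert.h>
    #include <stddef.h>

    struct aux_data { unsigned long max_state; };
    struct priv_data { struct aux_data *priv; };

    typedef int (*get_max_state_fn)(struct priv_data *p, unsigned long *state);

    static int get_max_state(struct priv_data *p, unsigned long *state)
    {
        assert(p != NULL && state != NULL);

        if (p->priv == NULL)
            return -1;  /* the real driver would crash here instead */

        *state = p->priv->max_state;
        return 0;
    }

    /* Stand-in for a register function that calls the op while registering. */
    static int register_cooling_device(struct priv_data *p, get_max_state_fn op,
                                       unsigned long *max_state)
    {
        return op(p, max_state);
    }
    ```

    Registering before priv is assigned fails in the sketch. Assigning it first,
    as the commit does, lets the op run safely during registration.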
     

16 May, 2018

2 commits

  • commit c8da6cdef57b459ac0fd5d9d348f8460a575ae90 upstream.

    On Exynos4210, tmu_read() might return an error for out-of-bound
    values. The current code ignores such an error, which leads to reporting a
    critical temperature value. Add proper error code propagation to the
    exynos_get_temp() function.

    Signed-off-by: Marek Szyprowski
    CC: stable@vger.kernel.org # v4.6+
    Signed-off-by: Bartlomiej Zolnierkiewicz
    Signed-off-by: Eduardo Valentin
    Signed-off-by: Greg Kroah-Hartman

    Marek Szyprowski
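
    A minimal sketch of the error propagation described in this commit, under
    stated assumptions: code_to_temp() and get_temp() are hypothetical helpers,
    and the valid code range, the offset and the -EINVAL error code are made up
    for illustration rather than taken from the driver.

    ```c
    #include <assert.h>
    #include <errno.h>
    #include <stddef.h>

    /* Hypothetical bounds for a valid raw sensor code. */
    #define MIN_CODE 75
    #define MAX_CODE 175

    /* Returns degrees Celsius, or a negative errno for out-of-bound codes. */
    static int code_to_temp(int code)
    {
        if (code < MIN_CODE || code > MAX_CODE)
            return -EINVAL;
        return code - 50;  /* made-up offset */
    }

    /* Stores millidegrees in *temp_mc, or passes the error on unchanged. */
    static int get_temp(int code, int *temp_mc)
    {
        int value;

        assert(temp_mc != NULL);

        value = code_to_temp(code);
        if (value < 0)
            return value;  /* propagate instead of reporting a bogus temperature */

        *temp_mc = value * 1000;
        return 0;
    }
    ```

    On an out-of-bound code the caller gets the error back and *temp_mc is left
    untouched, so no bogus temperature is reported.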
     
  • commit 88fc6f73fddf64eb507b04f7b2bd01d7291db514 upstream.

    When the thermal sensor is not yet enabled, reading the temperature might
    return a random value. This might even stop the system from booting when
    that value is higher than the critical temperature. Fix this by checking
    whether the TMU has actually been enabled before reading the temperature.

    This change fixes booting of Exynos4210-based board with TMU enabled (for
    example Samsung Trats board), which was broken since v4.4 kernel release.

    Signed-off-by: Marek Szyprowski
    Fixes: 9e4249b40340 ("thermal: exynos: Fix first temperature read after registering sensor")
    CC: stable@vger.kernel.org # v4.6+
    Signed-off-by: Bartlomiej Zolnierkiewicz
    Signed-off-by: Eduardo Valentin
    Signed-off-by: Greg Kroah-Hartman

    Marek Szyprowski
     

24 Apr, 2018

1 commit

  • commit cf1ba1d73a33944d8c1a75370a35434bf146b8a7 upstream.

    When the device boots with T > T_trip_1 and requests the interrupt,
    a race condition occurs: the interrupt arrives before
    THERMAL_DEVICE_ENABLED is set. The irq handler then attempts to
    read the sensor value and disables the sensor based on the
    data->mode field, which is expected to be THERMAL_DEVICE_ENABLED
    but is still THERMAL_DEVICE_DISABLED. After this, the
    sensor is never re-enabled, as the driver state is wrong.

    Fix this problem by setting the 'data' members prior to
    requesting the interrupts.

    Fixes: 37713a1e8e4c ("thermal: imx: implement thermal alarm interrupt handling")
    Cc:
    Signed-off-by: Mikhail Lappo
    Signed-off-by: Fabio Estevam
    Reviewed-by: Philipp Zabel
    Acked-by: Dong Aisheng
    Signed-off-by: Zhang Rui
    Signed-off-by: Greg Kroah-Hartman

    Mikhail Lappo
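
    The race described in this commit can be modelled in a few lines. This is
    only a sketch: the struct, the handler and request_irq_pending() are
    hypothetical, and the "interrupt" is delivered synchronously to mimic an
    alarm that is already pending when the irq is requested.

    ```c
    #include <assert.h>
    #include <stddef.h>

    enum device_mode { DEVICE_DISABLED, DEVICE_ENABLED };

    struct sensor {
        enum device_mode mode;
        int powered;
    };

    /* Modelled handler: powers the sensor down unless the mode says enabled. */
    static void alarm_handler(struct sensor *s)
    {
        assert(s != NULL);
        if (s->mode != DEVICE_ENABLED)
            s->powered = 0;
    }

    /* Stand-in for requesting an irq whose alarm is already pending. */
    static void request_irq_pending(struct sensor *s)
    {
        alarm_handler(s);
    }

    static void probe_buggy(struct sensor *s)
    {
        s->powered = 1;
        request_irq_pending(s);   /* handler sees DEVICE_DISABLED */
        s->mode = DEVICE_ENABLED;
    }

    static void probe_fixed(struct sensor *s)
    {
        s->powered = 1;
        s->mode = DEVICE_ENABLED; /* set state before requesting the irq */
        request_irq_pending(s);
    }
    ```

    With the buggy ordering the sensor ends up powered down while the mode claims
    it is enabled. With the fixed ordering it stays powered.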
     

12 Apr, 2018

2 commits

  • [ Upstream commit 0be86969ae385c5c944286bd9f66068525de15ee ]

    There are resources that are not deallocated on the failure path
    in int3400_thermal_probe().

    Found by Linux Driver Verification project (linuxtesting.org).

    Signed-off-by: Alexey Khoroshilov
    Signed-off-by: Zhang Rui
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Alexey Khoroshilov
     
  • [ Upstream commit a5de11d67dcd268b8d0beb73dc374de5e97f0caf ]

    When invoking allow_maximum_power and traversing tz->thermal_instances,
    we should hold thermal_zone_device->lock to avoid a race condition. For
    example, during system reboot, if the Mali GPU device implements a
    device shutdown callback and unregisters the GPU devfreq cooling device,
    the deleted list head may be accessed and cause a panic, as the following
    log shows:

    [ 33.551070] c3 25 (kworker/3:0) Unable to handle kernel paging request at virtual address dead000000000070
    [ 33.566708] c3 25 (kworker/3:0) pgd = ffffffc0ed290000
    [ 33.572071] c3 25 (kworker/3:0) [dead000000000070] *pgd=00000001ed292003, *pud=00000001ed292003, *pmd=0000000000000000
    [ 33.581515] c3 25 (kworker/3:0) Internal error: Oops: 96000004 [#1] PREEMPT SMP
    [ 33.599761] c3 25 (kworker/3:0) CPU: 3 PID: 25 Comm: kworker/3:0 Not tainted 4.4.35+ #912
    [ 33.614137] c3 25 (kworker/3:0) Workqueue: events_freezable thermal_zone_device_check
    [ 33.620245] c3 25 (kworker/3:0) task: ffffffc0f32e4200 ti: ffffffc0f32f0000 task.ti: ffffffc0f32f0000
    [ 33.629466] c3 25 (kworker/3:0) PC is at power_allocator_throttle+0x7c8/0x8a4
    [ 33.636609] c3 25 (kworker/3:0) LR is at power_allocator_throttle+0x808/0x8a4
    [ 33.643742] c3 25 (kworker/3:0) pc : [] lr : [] pstate: 20000145
    [ 33.652874] c3 25 (kworker/3:0) sp : ffffffc0f32f3bb0
    [ 34.468519] c3 25 (kworker/3:0) Process kworker/3:0 (pid: 25, stack limit = 0xffffffc0f32f0020)
    [ 34.477220] c3 25 (kworker/3:0) Stack: (0xffffffc0f32f3bb0 to 0xffffffc0f32f4000)
    [ 34.819822] c3 25 (kworker/3:0) Call trace:
    [ 34.824021] c3 25 (kworker/3:0) Exception stack(0xffffffc0f32f39c0 to 0xffffffc0f32f3af0)
    [ 34.924993] c3 25 (kworker/3:0) [] power_allocator_throttle+0x7c8/0x8a4
    [ 34.933184] c3 25 (kworker/3:0) [] handle_thermal_trip.part.25+0x70/0x224
    [ 34.941545] c3 25 (kworker/3:0) [] thermal_zone_device_update+0xc0/0x20c
    [ 34.949818] c3 25 (kworker/3:0) [] thermal_zone_device_check+0x20/0x2c
    [ 34.957924] c3 25 (kworker/3:0) [] process_one_work+0x168/0x458
    [ 34.965414] c3 25 (kworker/3:0) [] worker_thread+0x13c/0x4b4
    [ 34.972650] c3 25 (kworker/3:0) [] kthread+0xe8/0xfc
    [ 34.979187] c3 25 (kworker/3:0) [] ret_from_fork+0x10/0x40
    [ 34.986244] c3 25 (kworker/3:0) Code: f9405e73 eb1302bf d102e273 54ffc460 (b9402a61)
    [ 34.994339] c3 25 (kworker/3:0) ---[ end trace 32057901e3b7e1db ]---

    Signed-off-by: Yi Zeng
    Signed-off-by: Zhang Rui
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Yi Zeng
     

25 Dec, 2017

4 commits

  • commit db2b0332608c8e648ea1e44727d36ad37cdb56cb upstream.

    The DT specifies a threshold of 65000, but we set up the register with a
    value in the temperature resolution of the controller, 64656.

    When we reach 64656, the interrupt fires and is disabled. Then the
    irq thread runs and calls thermal_zone_device_update(), which in turn calls
    hisi_thermal_get_temp().

    The function checks whether the temperature has decreased, assuming it was
    more than 65000, but that is not the case because the current temperature is
    64656 (because of the rounding when setting the threshold). This condition
    being true, we re-enable the interrupt, which fires immediately after exiting
    the irq thread. That happens again and again until the temperature rises
    above 65000.

    Potentially, this is an interrupt storm if the temperature stabilizes at
    this value. A very unlikely case, but possible.

    In any case, it does not make sense to handle dozens of alarm interrupts for
    nothing.

    Fix this by rounding the threshold value to the controller resolution so the
    check against the threshold is consistent with the one set in the controller.

    Signed-off-by: Daniel Lezcano
    Reviewed-by: Leo Yan
    Tested-by: Leo Yan
    Signed-off-by: Eduardo Valentin
    Signed-off-by: Kevin Wangtao
    Signed-off-by: Greg Kroah-Hartman

    Daniel Lezcano
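
    The rounding in this commit can be checked with a short worked example.
    The base of -60000 and the step of 784 (both in millidegrees Celsius) are
    assumptions, chosen because they reproduce the 64656 figure quoted in the
    commit message. The function names are illustrative as well.

    ```c
    #include <assert.h>

    #define TEMP_BASE_MC (-60000)  /* assumed base temperature */
    #define TEMP_STEP_MC 784       /* assumed resolution of one register step */

    /* Convert millidegrees to a register step count (truncating). */
    static int temp_to_step(int temp_mc)
    {
        assert(temp_mc >= TEMP_BASE_MC);
        return (temp_mc - TEMP_BASE_MC) / TEMP_STEP_MC;
    }

    /* Convert a register step count back to millidegrees. */
    static int step_to_temp(int step)
    {
        return TEMP_BASE_MC + step * TEMP_STEP_MC;
    }

    /* Round a threshold to what the controller can actually represent. */
    static int round_to_resolution(int temp_mc)
    {
        return step_to_temp(temp_to_step(temp_mc));
    }
    ```

    Under these assumptions, 65000 maps to step 159 and rounds to 64656.
    Comparing the measured temperature against the rounded value rather than
    against 65000 keeps the software check consistent with the hardware
    threshold.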
     
  • commit 48880b979cdc9ef5a70af020f42b8ba1e51dbd34 upstream.

    The step and the base temperature are fixed values, so we can simplify the
    computation by converting the base temperature to millicelsius and using a
    pre-computed step value. That saves a lot of needless multiplications and
    divisions at runtime.

    Also take the opportunity to change the function names to be consistent with
    the rest of the code.

    Signed-off-by: Daniel Lezcano
    Reviewed-by: Leo Yan
    Tested-by: Leo Yan
    Signed-off-by: Eduardo Valentin
    Signed-off-by: Kevin Wangtao
    Signed-off-by: Greg Kroah-Hartman

    Daniel Lezcano
     
  • commit 2cb4de785c40d4a2132cfc13e63828f5a28c3351 upstream.

    The threaded interrupt for the alarm interrupt is requested before the
    temperature controller is set up. The controller can fire an interrupt
    immediately, leading to a kernel panic as the sensor data is not initialized.

    To prevent that, request the threaded irq after the Tsensor is set up.

    Signed-off-by: Daniel Lezcano
    Reviewed-by: Leo Yan
    Tested-by: Leo Yan
    Signed-off-by: Eduardo Valentin
    Signed-off-by: Kevin Wangtao
    Signed-off-by: Greg Kroah-Hartman

    Daniel Lezcano
     
  • commit c176b10b025acee4dc8f2ab1cd64eb73b5ccef53 upstream.

    The interrupt for the temperature threshold is not enabled at the end of the
    probe function; enable it after the setup is complete.

    In addition, irq_enabled is not set correctly, as we are checking whether
    the interrupt is masked, where 'yes' means irq_enabled=false.

        irq_get_irqchip_state(data->irq, IRQCHIP_STATE_MASKED,
                              &data->irq_enabled);

    As we always enable the interrupt, it is pointless to check whether
    the interrupt is masked; just set irq_enabled to 'true'.

    Signed-off-by: Daniel Lezcano
    Reviewed-by: Leo Yan
    Tested-by: Leo Yan
    Signed-off-by: Eduardo Valentin
    Signed-off-by: Kevin Wangtao
    Signed-off-by: Greg Kroah-Hartman

    Daniel Lezcano
     

20 Dec, 2017

1 commit

  • [ Upstream commit 07209fcf33542c1ff1e29df2dbdf8f29cdaacb10 ]

    There is a particular situation, when the cooling device is cpufreq and the
    heat dissipation is not efficient enough, where the temperature increases
    little by little until it reaches the critical threshold and leads to a SoC
    reset.

    The behavior is reproducible on a hikey6220 with bad heat dissipation (e.g.
    stacked with other boards).

    Running a simple C program doing while(1); on each CPU of the SoC makes the
    temperature reach the passive regulation trip point and end up at the
    maximum allowed temperature, followed by a reset.

    This issue has also been reported when running the libhugetlbfs test suite.

    What is observed is a ping-pong between two cpu frequencies, 1.2GHz and
    900MHz, while the temperature continues to grow.

    It appears the step wise governor calls get_target_state() the first time with
    the throttle set to true and the trend to 'raising'. The code logically selects
    the next state, so the cpu frequency decreases from 1.2GHz to 900MHz. So far so
    good. The temperature decreases immediately but still stays greater than the
    trip point, so get_target_state() is called again, this time with the
    throttle set to true *and* the trend to 'dropping'. From there the algorithm
    assumes we have to step down the state, and the cpu frequency jumps back to
    1.2GHz. But the temperature is still higher than the trip point, so
    get_target_state() is called with throttle=1 and trend='raising' again and we
    jump to 900MHz; then get_target_state() is called with throttle=1 and
    trend='dropping' and we jump to 1.2GHz, and so on, but the temperature does not
    stabilize and continues to increase.

    [ 237.922654] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=1,throttle=1
    [ 237.922678] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=1,throttle=1
    [ 237.922690] thermal cooling_device0: cur_state=0
    [ 237.922701] thermal cooling_device0: old_target=0, target=1
    [ 238.026656] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=2,throttle=1
    [ 238.026680] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=2,throttle=1
    [ 238.026694] thermal cooling_device0: cur_state=1
    [ 238.026707] thermal cooling_device0: old_target=1, target=0
    [ 238.134647] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=1,throttle=1
    [ 238.134667] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=1,throttle=1
    [ 238.134679] thermal cooling_device0: cur_state=0
    [ 238.134690] thermal cooling_device0: old_target=0, target=1

    In this situation the temperature continues to increase while the trend is
    oscillating between 'dropping' and 'raising'. We need to keep the current state
    untouched if the throttle is set, so the temperature can decrease or a higher
    state could be selected, thus preventing this oscillation.

    Keeping the next_target untouched when 'throttle' is true at 'dropping' time
    fixes the issue.

    The following traces show the governor does not change the next state if
    trend==2 (dropping) and throttle==1.

    [ 2306.127987] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=1,throttle=1
    [ 2306.128009] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=1,throttle=1
    [ 2306.128021] thermal cooling_device0: cur_state=0
    [ 2306.128031] thermal cooling_device0: old_target=0, target=1
    [ 2306.231991] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=2,throttle=1
    [ 2306.232016] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=2,throttle=1
    [ 2306.232030] thermal cooling_device0: cur_state=1
    [ 2306.232042] thermal cooling_device0: old_target=1, target=1
    [ 2306.335982] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=0,throttle=1
    [ 2306.336006] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=0,throttle=1
    [ 2306.336021] thermal cooling_device0: cur_state=1
    [ 2306.336034] thermal cooling_device0: old_target=1, target=1
    [ 2306.439984] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=2,throttle=1
    [ 2306.440008] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=2,throttle=0
    [ 2306.440022] thermal cooling_device0: cur_state=1
    [ 2306.440034] thermal cooling_device0: old_target=1, target=0

    [ ... ]

    After a while, if the temperature continues to increase, the next state becomes
    2 which is 720MHz on the hikey. That results in the temperature stabilizing
    around the trip point.

    [ 2455.831982] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=1,throttle=1
    [ 2455.832006] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=1,throttle=0
    [ 2455.832019] thermal cooling_device0: cur_state=1
    [ 2455.832032] thermal cooling_device0: old_target=1, target=1
    [ 2455.935985] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=0,throttle=1
    [ 2455.936013] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=0,throttle=0
    [ 2455.936027] thermal cooling_device0: cur_state=1
    [ 2455.936040] thermal cooling_device0: old_target=1, target=1
    [ 2456.043984] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=0,throttle=1
    [ 2456.044009] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=0,throttle=0
    [ 2456.044023] thermal cooling_device0: cur_state=1
    [ 2456.044036] thermal cooling_device0: old_target=1, target=1
    [ 2456.148001] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=1,throttle=1
    [ 2456.148028] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=1,throttle=1
    [ 2456.148042] thermal cooling_device0: cur_state=1
    [ 2456.148055] thermal cooling_device0: old_target=1, target=2
    [ 2456.252009] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=2,throttle=1
    [ 2456.252041] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=2,throttle=0
    [ 2456.252058] thermal cooling_device0: cur_state=2
    [ 2456.252075] thermal cooling_device0: old_target=2, target=1

    IOW, this change is needed to keep the state for a cooling device if the
    temperature trend is oscillating while the temperature increases slightly.

    Without this change, the situation above leads to a catastrophic crash by a
    hardware reset on hikey. This issue has been reported to happen on an OMAP
    dra7xx also.

    Signed-off-by: Daniel Lezcano
    Cc: Keerthy
    Cc: John Stultz
    Cc: Leo Yan
    Tested-by: Keerthy
    Reviewed-by: Keerthy
    Signed-off-by: Eduardo Valentin
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Daniel Lezcano
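
    A simplified sketch of the decision described in this commit, not the
    governor's actual get_target_state(). The function name, its parameters and
    the enum are illustrative. The numeric trend values follow the traces above
    (2 is 'dropping' and 1 appears with 'raising' behavior), and treating 0 as
    'stable' is an assumption.

    ```c
    #include <assert.h>
    #include <stdbool.h>

    enum trend { TREND_STABLE = 0, TREND_RAISING = 1, TREND_DROPPING = 2 };

    /* Pick the next cooling state within [lower, upper]. */
    static unsigned long next_state(unsigned long cur, unsigned long lower,
                                    unsigned long upper, enum trend trend,
                                    bool throttle)
    {
        unsigned long next = cur;

        assert(lower <= upper);

        switch (trend) {
        case TREND_RAISING:
            /* Still above the trip point and heating: cool harder. */
            if (throttle && cur < upper)
                next = cur + 1;
            break;
        case TREND_DROPPING:
            /* Only relax cooling once we are no longer throttling. */
            if (!throttle && cur > lower)
                next = cur - 1;
            break;
        default:
            break;
        }
        return next;
    }
    ```

    With throttle set and a dropping trend the state is kept, which matches the
    "old_target=1, target=1" lines in the traces. The state only steps down once
    throttle is cleared.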
     

02 Nov, 2017

1 commit

  • Many source files in the tree are missing licensing information, which
    makes it harder for compliance tools to determine the correct license.

    By default all files without license information are under the default
    license of the kernel, which is GPL version 2.

    Update the files which contain no license information with the 'GPL-2.0'
    SPDX license identifier. The SPDX identifier is a legally binding
    shorthand, which can be used instead of the full boiler plate text.

    This patch is based on work done by Thomas Gleixner and Kate Stewart and
    Philippe Ombredanne.

    How this work was done:

    Patches were generated and checked against linux-4.14-rc6 for a subset of
    the use cases:
    - file had no licensing information in it,
    - file was a */uapi/* one with no licensing information in it,
    - file was a */uapi/* one with existing licensing information.

    Further patches will be generated in subsequent months to fix up cases
    where non-standard license headers were used, and references to license
    had to be inferred by heuristics based on keywords.

    The analysis to determine which SPDX License Identifier should be applied to
    a file was done in a spreadsheet of side-by-side results from the
    output of two independent scanners (ScanCode & Windriver) producing SPDX
    tag:value files created by Philippe Ombredanne. Philippe prepared the
    base worksheet, and did an initial spot review of a few 1000 files.

    The 4.13 kernel was the starting point of the analysis with 60,537 files
    assessed. Kate Stewart did a file by file comparison of the scanner
    results in the spreadsheet to determine which SPDX license identifier(s)
    to be applied to the file. She confirmed any determination that was not
    immediately clear with lawyers working with the Linux Foundation.

    The criteria used to select files for SPDX license identifier tagging were:
    - Files considered eligible had to be source code files.
    - Make and config files were included as candidates if they contained >5
    lines of source
    - File already had some variant of a license header in it (even if
    Reviewed-by: Philippe Ombredanne
    Reviewed-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman