08 Oct, 2020

1 commit

  • * tag 'v5.4.70': (3051 commits)
    Linux 5.4.70
    netfilter: ctnetlink: add a range check for l3/l4 protonum
    ep_create_wakeup_source(): dentry name can change under you...
    ...

    Conflicts:
    arch/arm/mach-imx/pm-imx6.c
    arch/arm64/boot/dts/freescale/imx8mm-evk.dts
    arch/arm64/boot/dts/freescale/imx8mn-ddr4-evk.dts
    drivers/crypto/caam/caamalg.c
    drivers/gpu/drm/imx/dw_hdmi-imx.c
    drivers/gpu/drm/imx/imx-ldb.c
    drivers/gpu/drm/imx/ipuv3/ipuv3-crtc.c
    drivers/mmc/host/sdhci-esdhc-imx.c
    drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c
    drivers/net/ethernet/freescale/enetc/enetc.c
    drivers/net/ethernet/freescale/enetc/enetc_pf.c
    drivers/thermal/imx_thermal.c
    drivers/usb/cdns3/ep0.c
    drivers/xen/swiotlb-xen.c
    sound/soc/fsl/fsl_esai.c
    sound/soc/fsl/fsl_sai.c

    Signed-off-by: Jason Liu

    Jason Liu
     

01 Oct, 2020

1 commit

  • [ Upstream commit 39056e8a989ef52486e063e34b4822b341e47b0e ]

    If the common register memory resource is not available the driver needs
    to fail gracefully to disable PM. Instead of returning the error
    directly, store it in ret and use the already existing error path.
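
    The pattern described above, recording the error in ret and jumping to
    a shared cleanup label, can be sketched in plain userspace C. This is
    only an illustration: probe_sketch, struct pm_state and the -ENODEV
    error code are hypothetical and are not taken from the driver.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for runtime PM state; not a kernel API. */
struct pm_state {
	bool enabled;
};

/*
 * Sketch of a probe routine with a single error path. A missing
 * resource does not return directly; the error is stored in ret and
 * control jumps to the label that undoes the earlier PM setup.
 */
int probe_sketch(struct pm_state *pm, bool have_common_regs)
{
	int ret = 0;

	pm->enabled = true;	/* setup that must be undone on failure */

	if (!have_common_regs) {
		ret = -ENODEV;	/* assumed error code, for illustration only */
		goto error_unregister;
	}

	return 0;

error_unregister:
	pm->enabled = false;	/* shared cleanup: disable PM again */
	return ret;
}
```

    With a direct return in the failing branch, pm->enabled would stay
    true, which is the kind of leftover state the shared error path avoids.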

    Signed-off-by: Niklas Söderlund
    Reviewed-by: Geert Uytterhoeven
    Signed-off-by: Daniel Lezcano
    Link: https://lore.kernel.org/r/20200310114709.1483860-1-niklas.soderlund+renesas@ragnatech.se
    Signed-off-by: Sasha Levin

    Niklas Söderlund
     

10 Sep, 2020

2 commits

  • [ Upstream commit 0ffdab6f2dea9e23ec33230de24e492ff0b186d9 ]

    Currently the driver is suppressing the negative temperature
    readings from the vadc. Consumers of the thermal zones need
    to read the negative temperature too. Don't suppress the
    readings.

    Fixes: c610afaa21d3c6e ("thermal: Add QPNP PMIC temperature alarm driver")
    Signed-off-by: Veera Vegivada
    Signed-off-by: Guru Das Srinagesh
    Reviewed-by: Stephen Boyd
    Signed-off-by: Daniel Lezcano
    Link: https://lore.kernel.org/r/944856eb819081268fab783236a916257de120e4.1596040416.git.gurus@codeaurora.org
    Signed-off-by: Sasha Levin

    Veera Vegivada
     
  • [ Upstream commit 30d24faba0532d6972df79a1bf060601994b5873 ]

    We can sometimes get bogus thermal shutdowns on omap4430 at least with
    droid4 running idle with a battery charger connected:

    thermal thermal_zone0: critical temperature reached (143 C), shutting down

    Dumping out the register values shows we can occasionally get a 0x7f value
    that is outside the TRM listed values in the ADC conversion table. And then
    we get a normal value when reading again after that. Reading the register
    multiple times does not seem to help avoid the bogus values as they stay
    until the next sample is ready.

    Looking at the TRM chapter "18.4.10.2.3 ADC Codes Versus Temperature", we
    should have values from 13 to 107 listed with a total of 95 values. But
    looking at the omap4430_adc_to_temp array, the values are off, and the
    end values are missing. And it seems that the 4430 ADC table is similar
    to omap3630 rather than omap4460.

    Let's fix the issue by using values based on the omap3630 table and just
    ignoring invalid values. Compared to the 4430 TRM, the omap3630 table has
    the missing values added while the TRM table only shows every second
    value.

    Note that sometimes the ADC register values within the valid table can
    also be way off for about 1 out of 10 values. But it seems that those
    just read about 25 C too low rather than too high. So those
    do not cause a bogus thermal shutdown.

    Fixes: 1a31270e54d7 ("staging: omap-thermal: add OMAP4 data structures")
    Cc: Merlijn Wajer
    Cc: Pavel Machek
    Cc: Sebastian Reichel
    Signed-off-by: Tony Lindgren
    Signed-off-by: Daniel Lezcano
    Link: https://lore.kernel.org/r/20200706183338.25622-1-tony@atomide.com
    Signed-off-by: Sasha Levin

    Tony Lindgren
     

19 Aug, 2020

1 commit


22 Jul, 2020

4 commits

  • commit 371a3bc79c11b707d7a1b7a2c938dc3cc042fffb upstream.

    The function cpu_power_to_freq is used to find a frequency and set the
    cooling device to consume at most the power to be converted. For example,
    if the power to be converted is 80mW, and the em table is as follows:
    struct em_cap_state table[] = {
    /* KHz mW */
    { 1008000, 36, 0 },
    { 1200000, 49, 0 },
    { 1296000, 59, 0 },
    { 1416000, 72, 0 },
    { 1512000, 86, 0 },
    };
    The target frequency should be 1416000KHz, not 1512000KHz.
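
    The lookup described above can be sketched as a standalone C function.
    This is a minimal illustration, not the kernel implementation: struct
    cap_state and power_to_freq_sketch are hypothetical names, the third
    field of each table entry is dropped, and the table is assumed to be
    sorted by ascending frequency with at least one entry.

```c
#include <stddef.h>

/* Hypothetical simplified entry: frequency in kHz, power in mW. */
struct cap_state {
	unsigned long frequency_khz;
	unsigned long power_mw;
};

/* Values from the commit message, sorted by ascending frequency. */
const struct cap_state table[] = {
	{ 1008000, 36 },
	{ 1200000, 49 },
	{ 1296000, 59 },
	{ 1416000, 72 },
	{ 1512000, 86 },
};

/*
 * Return the highest frequency whose power does not exceed power_mw.
 * If even the lowest state is over budget, the lowest state is used
 * as a floor. n must be at least 1.
 */
unsigned long power_to_freq_sketch(const struct cap_state *t, size_t n,
				   unsigned long power_mw)
{
	size_t i;

	/* Walk from the highest state down to the first one within budget. */
	for (i = n; i > 1; i--) {
		if (t[i - 1].power_mw <= power_mw)
			break;
	}

	return t[i - 1].frequency_khz;
}
```

    For a budget of 80mW the walk skips the 86mW entry and stops at the
    72mW entry, so the result is 1416000, matching the expectation stated
    above.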

    Fixes: 349d39dc5739 ("thermal: cpu_cooling: merge frequency and power tables")
    Cc: # v4.13+
    Signed-off-by: Finley Xiao
    Acked-by: Viresh Kumar
    Reviewed-by: Amit Kucheria
    Signed-off-by: Daniel Lezcano
    Link: https://lore.kernel.org/r/20200619090825.32747-1-finley.xiao@rock-chips.com
    Signed-off-by: Viresh Kumar
    Signed-off-by: Greg Kroah-Hartman

    Finley Xiao
     
  • commit f3d7fb38976b1b0a8462ba1c7cbd404ddfaad086 upstream.

    Downgrade "Unsupported event" message from dev_err to dev_dbg to avoid
    flooding with this message on some platforms.

    Cc: stable@vger.kernel.org # v5.4+
    Suggested-by: Zhang Rui
    Signed-off-by: Alex Hung
    [ rzhang: fix typo in changelog ]
    Signed-off-by: Zhang Rui
    Link: https://lore.kernel.org/r/20200615223957.183153-1-alex.hung@canonical.com
    Signed-off-by: Greg Kroah-Hartman

    Alex Hung
     
  • [ Upstream commit a8f62f183021be389561570ab5f8c701a5e70298 ]

    This reverts commit eb9aecd90d1a39601e91cd08b90d5fee51d321a6

    The above patch is supposed to fix a register index error on mt2701. It
    is not clear whether the problem it solved was a hang or just an invalid
    value being returned; my guess is the latter. The patch, though,
    introduces a new hang on MT8173 devices, making them unusable. So it
    seems reasonable to revert the patch, because it introduces a worse
    issue.

    The reason I am sending a revert instead of trying to fix the issue for
    MT8173 is that the information needed to fix it is in the datasheet,
    which is not public. So I am not really able to fix it.

    Fixes the following bug when CONFIG_MTK_THERMAL is set on MT8173
    devices.

    [ 2.222488] Unable to handle kernel paging request at virtual address ffff8000125f5001
    [ 2.230421] Mem abort info:
    [ 2.233207] ESR = 0x96000021
    [ 2.236261] EC = 0x25: DABT (current EL), IL = 32 bits
    [ 2.241571] SET = 0, FnV = 0
    [ 2.244623] EA = 0, S1PTW = 0
    [ 2.247762] Data abort info:
    [ 2.250640] ISV = 0, ISS = 0x00000021
    [ 2.254473] CM = 0, WnR = 0
    [ 2.257544] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000041850000
    [ 2.264251] [ffff8000125f5001] pgd=000000013ffff003, pud=000000013fffe003, pmd=000000013fff9003, pte=006800001100b707
    [ 2.274867] Internal error: Oops: 96000021 [#1] PREEMPT SMP
    [ 2.280432] Modules linked in:
    [ 2.283483] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.7.0-rc6+ #162
    [ 2.289914] Hardware name: Google Elm (DT)
    [ 2.294003] pstate: 20000005 (nzCv daif -PAN -UAO)
    [ 2.298792] pc : mtk_read_temp+0xb8/0x1c8
    [ 2.302793] lr : mtk_read_temp+0x7c/0x1c8
    [ 2.306794] sp : ffff80001003b930
    [ 2.310100] x29: ffff80001003b930 x28: 0000000000000000
    [ 2.315404] x27: 0000000000000002 x26: ffff0000f9550b10
    [ 2.320709] x25: ffff0000f9550a80 x24: 0000000000000090
    [ 2.326014] x23: ffff80001003ba24 x22: 00000000610344c0
    [ 2.331318] x21: 0000000000002710 x20: 00000000000001f4
    [ 2.336622] x19: 0000000000030d40 x18: ffff800011742ec0
    [ 2.341926] x17: 0000000000000001 x16: 0000000000000001
    [ 2.347230] x15: ffffffffffffffff x14: ffffff0000000000
    [ 2.352535] x13: ffffffffffffffff x12: 0000000000000028
    [ 2.357839] x11: 0000000000000003 x10: ffff800011295ec8
    [ 2.363143] x9 : 000000000000291b x8 : 0000000000000002
    [ 2.368447] x7 : 00000000000000a8 x6 : 0000000000000004
    [ 2.373751] x5 : 0000000000000000 x4 : ffff800011295cb0
    [ 2.379055] x3 : 0000000000000002 x2 : ffff8000125f5001
    [ 2.384359] x1 : 0000000000000001 x0 : ffff0000f9550a80
    [ 2.389665] Call trace:
    [ 2.392105] mtk_read_temp+0xb8/0x1c8
    [ 2.395760] of_thermal_get_temp+0x2c/0x40
    [ 2.399849] thermal_zone_get_temp+0x78/0x160
    [ 2.404198] thermal_zone_device_update.part.0+0x3c/0x1f8
    [ 2.409589] thermal_zone_device_update+0x34/0x48
    [ 2.414286] of_thermal_set_mode+0x58/0x88
    [ 2.418375] thermal_zone_of_sensor_register+0x1a8/0x1d8
    [ 2.423679] devm_thermal_zone_of_sensor_register+0x64/0xb0
    [ 2.429242] mtk_thermal_probe+0x690/0x7d0
    [ 2.433333] platform_drv_probe+0x5c/0xb0
    [ 2.437335] really_probe+0xe4/0x448
    [ 2.440901] driver_probe_device+0xe8/0x140
    [ 2.445077] device_driver_attach+0x7c/0x88
    [ 2.449252] __driver_attach+0xac/0x178
    [ 2.453082] bus_for_each_dev+0x78/0xc8
    [ 2.456909] driver_attach+0x2c/0x38
    [ 2.460476] bus_add_driver+0x14c/0x230
    [ 2.464304] driver_register+0x6c/0x128
    [ 2.468131] __platform_driver_register+0x50/0x60
    [ 2.472831] mtk_thermal_driver_init+0x24/0x30
    [ 2.477268] do_one_initcall+0x50/0x298
    [ 2.481098] kernel_init_freeable+0x1ec/0x264
    [ 2.485450] kernel_init+0x1c/0x110
    [ 2.488931] ret_from_fork+0x10/0x1c
    [ 2.492502] Code: f9401081 f9400402 b8a67821 8b010042 (b9400042)
    [ 2.498599] ---[ end trace e43e3105ed27dc99 ]---
    [ 2.503367] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
    [ 2.511020] SMP: stopping secondary CPUs
    [ 2.514941] Kernel Offset: disabled
    [ 2.518421] CPU features: 0x090002,25006005
    [ 2.522595] Memory Limit: none
    [ 2.525644] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b ]--

    Cc: Michael Kao
    Fixes: eb9aecd90d1a ("thermal: mediatek: fix register index error")
    Signed-off-by: Enric Balletbo i Serra
    Reviewed-by: Matthias Brugger
    Signed-off-by: Daniel Lezcano
    Link: https://lore.kernel.org/r/20200707103412.1010823-1-enric.balletbo@collabora.com
    Signed-off-by: Sasha Levin

    Enric Balletbo i Serra
     
  • [ Upstream commit b45fd13be340e4ed0a2a9673ba299eb2a71ba829 ]

    Once the CPU node obtained from of_get_cpu_node() is no longer needed,
    of_node_put() needs to be called on it.

    Signed-off-by: Anson Huang
    Signed-off-by: Daniel Lezcano
    Link: https://lore.kernel.org/r/1585232945-23368-1-git-send-email-Anson.Huang@nxp.com
    Signed-off-by: Sasha Levin

    Anson Huang
     

09 Jul, 2020

2 commits

  • [ Upstream commit 5f8f06425a0dcdad7bedbb77e67f5c65ab4dacfc ]

    The description of DIV_ROUND_CLOSEST in include/linux/kernel.h says:
    "Result is undefined for negative divisors if the dividend variable
    type is unsigned and for negative dividends if the divisor variable
    type is unsigned."

    In the current code, FIXPT_DIV uses DIV_ROUND_CLOSEST without checking
    the sign of the divisor first. This produces an undefined temperature
    value when the value is negative.

    This patch makes the code satisfy the DIV_ROUND_CLOSEST description
    and fixes the bug as well. Note that the variable name "reg" is not good
    because it should be the same type as rcar_gen3_thermal_read().
    However, it's better to rename "reg" in a further patch as
    cleanup.
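
    To illustrate why the sign matters, here is a small userspace sketch of
    a round-to-closest division that is well defined for any combination of
    signs. It is neither the kernel macro nor the change made in the driver;
    div_round_closest_signed is a hypothetical helper that keeps both
    operands signed.

```c
#include <stdint.h>

/*
 * Round-to-closest division on signed operands. The rounding term is
 * added when the quotient is positive and subtracted when it is
 * negative, so halves round away from zero. den must be non-zero.
 */
int64_t div_round_closest_signed(int64_t num, int64_t den)
{
	if ((num < 0) == (den < 0))
		return (num + den / 2) / den;

	return (num - den / 2) / den;
}
```

    Keeping both operands in a signed type is what makes the sign test
    meaningful; once one of them is unsigned, a negative value has already
    been converted to a large positive one before the division happens.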

    Signed-off-by: Van Do
    Signed-off-by: Dien Pham
    [shimoda: minor fixes, add Fixes tag]
    Fixes: 564e73d283af ("thermal: rcar_gen3_thermal: Add R-Car Gen3 thermal driver")
    Signed-off-by: Yoshihiro Shimoda
    Reviewed-by: Niklas Soderlund
    Tested-by: Niklas Soderlund
    Reviewed-by: Amit Kucheria
    Signed-off-by: Daniel Lezcano
    Link: https://lore.kernel.org/r/1593085099-2057-1-git-send-email-yoshihiro.shimoda.uh@renesas.com
    Signed-off-by: Sasha Levin

    Dien Pham
     
  • [ Upstream commit 14533a5a6c12e8d7de79d309d4085bf186058fe1 ]

    MT8183_NUM_ZONES should be set to 1
    because MT8183 doesn't have multiple banks.

    Fixes: a4ffe6b52d27 ("thermal: mediatek: add support for MT8183")
    Signed-off-by: Michael Kao
    Signed-off-by: Hsin-Yi Wang
    Signed-off-by: Daniel Lezcano
    Link: https://lore.kernel.org/r/20200323121537.22697-6-michael.kao@mediatek.com
    Signed-off-by: Sasha Levin

    Michael Kao
     

24 Jun, 2020

1 commit

  • [ Upstream commit 7440f518dad9d861d76c64956641eeddd3586f75 ]

    On error, ti_bandgap_get_sensor_data() returns the error code wrapped
    in ERR_PTR(), but we only checked whether the return value is NULL.
    As a result, we can dereference an error code inside ERR_PTR.
    While at it, convert a check to IS_ERR_OR_NULL.
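
    The difference between a NULL check and an error-pointer check can be
    shown with a userspace sketch. The helpers below are simplified,
    hypothetical re-implementations written for this example (lowercase
    names, not the kernel's own definitions), and get_sensor_data_sketch
    only stands in for a function that returns an encoded error.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO_SKETCH 4095	/* assumed size of the error range */

/* Encode a negative errno value in a pointer. */
static inline void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* Decode the errno value from an error pointer. */
static inline long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True if the pointer lies in the top MAX_ERRNO_SKETCH addresses. */
static inline int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO_SKETCH;
}

static inline int is_err_or_null(const void *ptr)
{
	return !ptr || is_err(ptr);
}

struct sensor_data {
	int value;
};

/* Hypothetical getter: returns an encoded error instead of NULL. */
struct sensor_data *get_sensor_data_sketch(int fail)
{
	static struct sensor_data data = { 42 };

	if (fail)
		return err_ptr(-EINVAL);

	return &data;
}

int read_sensor_sketch(int fail, int *out)
{
	struct sensor_data *d = get_sensor_data_sketch(fail);

	/* A NULL-only check would let the encoded error through. */
	if (is_err_or_null(d))
		return d ? (int)ptr_err(d) : -ENODEV;

	*out = d->value;
	return 0;
}
```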

    Signed-off-by: Sudip Mukherjee
    Reviewed-by: Amit Kucheria
    Signed-off-by: Daniel Lezcano
    Link: https://lore.kernel.org/r/20200424161944.6044-1-sudipm.mukherjee@gmail.com
    Signed-off-by: Sasha Levin

    Sudip Mukherjee
     

19 Mar, 2020

1 commit


09 Mar, 2020

1 commit


08 Mar, 2020

1 commit

  • Merge Linux stable release v5.4.24 into imx_5.4.y

    * tag 'v5.4.24': (3306 commits)
    Linux 5.4.24
    blktrace: Protect q->blk_trace with RCU
    kvm: nVMX: VMWRITE checks unsupported field before read-only field
    ...

    Signed-off-by: Jason Liu

    Conflicts:
    arch/arm/boot/dts/imx6sll-evk.dts
    arch/arm/boot/dts/imx7ulp.dtsi
    arch/arm64/boot/dts/freescale/fsl-ls1028a.dtsi
    drivers/clk/imx/clk-composite-8m.c
    drivers/gpio/gpio-mxc.c
    drivers/irqchip/Kconfig
    drivers/mmc/host/sdhci-of-esdhc.c
    drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
    drivers/net/can/flexcan.c
    drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
    drivers/net/ethernet/mscc/ocelot.c
    drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
    drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
    drivers/net/phy/realtek.c
    drivers/pci/controller/mobiveil/pcie-mobiveil-host.c
    drivers/perf/fsl_imx8_ddr_perf.c
    drivers/tee/optee/shm_pool.c
    drivers/usb/cdns3/gadget.c
    kernel/sched/cpufreq.c
    net/core/xdp.c
    sound/soc/fsl/fsl_esai.c
    sound/soc/fsl/fsl_sai.c
    sound/soc/sof/core.c
    sound/soc/sof/imx/Kconfig
    sound/soc/sof/loader.c

    Jason Liu
     

05 Mar, 2020

2 commits

  • commit e1ff6fc22f19e2af8adbad618526b80067911d40 upstream.

    At the time the brcmstb_thermal driver and its binding were merged, the
    DT binding did not make the coefficients properties a mandatory one,
    therefore all users of the brcmstb_thermal driver out there have a non
    functional implementation with zero coefficients. Even if these
    properties were provided, the formula used for computation is incorrect.

    The coefficients are entirely process specific (right now, only 28nm is
    supported) and not board or SoC specific, it is therefore appropriate to
    hard code them in the driver given the compatibility string we are
    probed with which has to be updated whenever a new process is
    introduced.

    We remove the existing coefficients definition since subsequent patches
    are going to add support for a new process and will introduce new
    coefficients as well.

    Fixes: 9e03cf1b2dd5 ("thermal: add brcmstb AVS TMON driver")
    Signed-off-by: Florian Fainelli
    Reviewed-by: Amit Kucheria
    Signed-off-by: Daniel Lezcano
    Link: https://lore.kernel.org/r/20200114190607.29339-2-f.fainelli@gmail.com
    Signed-off-by: Greg Kroah-Hartman

    Florian Fainelli
     
  • commit c56dcfa3d4d0f49f0c37cd24886aa86db7aa7f30 upstream.

    We are not interested in getting this debug print on our
    console all the time.

    Cc: Daniel Lezcano
    Cc: Stephan Gerhold
    Fixes: 6c375eccded4 ("thermal: db8500: Rewrite to be a pure OF sensor")
    Signed-off-by: Linus Walleij
    Reviewed-by: Stephan Gerhold
    Signed-off-by: Daniel Lezcano
    Link: https://lore.kernel.org/r/20191119074650.2664-1-linus.walleij@linaro.org
    Signed-off-by: Greg Kroah-Hartman

    Linus Walleij
     

21 Jan, 2020

1 commit


16 Dec, 2019

1 commit

  • This is the 5.4.3 stable release

    Conflicts:
    drivers/cpufreq/imx-cpufreq-dt.c
    drivers/spi/spi-fsl-qspi.c

    The conflicts are very minor and were fixed during the merge. The one in
    imx-cpufreq-dt.c is just a one-line code-style change; the upstream
    version is used, with no functional change.

    spi-fsl-qspi.c has minor conflicts from merging the upstream fix c69b17da53b2
    ("spi: spi-fsl-qspi: Clear TDH bits in FLSHCR register").

    After the merge, a basic boot sanity test and a basic qspi test were done on i.MX.

    Signed-off-by: Jason Liu

    Jason Liu
     

13 Dec, 2019

1 commit

  • commit 163b00cde7cf2206e248789d2780121ad5e6a70b upstream.

    1851799e1d29 ("thermal: Fix use-after-free when unregistering thermal zone
    device") changed cancel_delayed_work to cancel_delayed_work_sync to avoid
    a use-after-free issue. However, cancel_delayed_work_sync could be called
    from inside the workqueue, causing a deadlock.

    [54109.642398] c0 1162 kworker/u17:1 D 0 11030 2 0x00000000
    [54109.642437] c0 1162 Workqueue: thermal_passive_wq thermal_zone_device_check
    [54109.642447] c0 1162 Call trace:
    [54109.642456] c0 1162 __switch_to+0x138/0x158
    [54109.642467] c0 1162 __schedule+0xba4/0x1434
    [54109.642480] c0 1162 schedule_timeout+0xa0/0xb28
    [54109.642492] c0 1162 wait_for_common+0x138/0x2e8
    [54109.642511] c0 1162 flush_work+0x348/0x40c
    [54109.642522] c0 1162 __cancel_work_timer+0x180/0x218
    [54109.642544] c0 1162 handle_thermal_trip+0x2c4/0x5a4
    [54109.642553] c0 1162 thermal_zone_device_update+0x1b4/0x25c
    [54109.642563] c0 1162 thermal_zone_device_check+0x18/0x24
    [54109.642574] c0 1162 process_one_work+0x3cc/0x69c
    [54109.642583] c0 1162 worker_thread+0x49c/0x7c0
    [54109.642593] c0 1162 kthread+0x17c/0x1b0
    [54109.642602] c0 1162 ret_from_fork+0x10/0x18
    [54109.643051] c0 1162 kworker/u17:2 D 0 16245 2 0x00000000
    [54109.643067] c0 1162 Workqueue: thermal_passive_wq thermal_zone_device_check
    [54109.643077] c0 1162 Call trace:
    [54109.643085] c0 1162 __switch_to+0x138/0x158
    [54109.643095] c0 1162 __schedule+0xba4/0x1434
    [54109.643104] c0 1162 schedule_timeout+0xa0/0xb28
    [54109.643114] c0 1162 wait_for_common+0x138/0x2e8
    [54109.643122] c0 1162 flush_work+0x348/0x40c
    [54109.643131] c0 1162 __cancel_work_timer+0x180/0x218
    [54109.643141] c0 1162 handle_thermal_trip+0x2c4/0x5a4
    [54109.643150] c0 1162 thermal_zone_device_update+0x1b4/0x25c
    [54109.643159] c0 1162 thermal_zone_device_check+0x18/0x24
    [54109.643167] c0 1162 process_one_work+0x3cc/0x69c
    [54109.643177] c0 1162 worker_thread+0x49c/0x7c0
    [54109.643186] c0 1162 kthread+0x17c/0x1b0
    [54109.643195] c0 1162 ret_from_fork+0x10/0x18
    [54109.644500] c0 1162 cat D 0 7766 1 0x00000001
    [54109.644515] c0 1162 Call trace:
    [54109.644524] c0 1162 __switch_to+0x138/0x158
    [54109.644536] c0 1162 __schedule+0xba4/0x1434
    [54109.644546] c0 1162 schedule_preempt_disabled+0x80/0xb0
    [54109.644555] c0 1162 __mutex_lock+0x3a8/0x7f0
    [54109.644563] c0 1162 __mutex_lock_slowpath+0x14/0x20
    [54109.644575] c0 1162 thermal_zone_get_temp+0x84/0x360
    [54109.644586] c0 1162 temp_show+0x30/0x78
    [54109.644609] c0 1162 dev_attr_show+0x5c/0xf0
    [54109.644628] c0 1162 sysfs_kf_seq_show+0xcc/0x1a4
    [54109.644636] c0 1162 kernfs_seq_show+0x48/0x88
    [54109.644656] c0 1162 seq_read+0x1f4/0x73c
    [54109.644664] c0 1162 kernfs_fop_read+0x84/0x318
    [54109.644683] c0 1162 __vfs_read+0x50/0x1bc
    [54109.644692] c0 1162 vfs_read+0xa4/0x140
    [54109.644701] c0 1162 SyS_read+0xbc/0x144
    [54109.644708] c0 1162 el0_svc_naked+0x34/0x38
    [54109.845800] c0 1162 D 720.000s 1->7766->7766 cat [panic]

    Fixes: 1851799e1d29 ("thermal: Fix use-after-free when unregistering thermal zone device")
    Cc: stable@vger.kernel.org
    Signed-off-by: Wei Wang
    Signed-off-by: Zhang Rui
    Signed-off-by: Greg Kroah-Hartman

    Wei Wang
     

25 Nov, 2019

11 commits


21 Oct, 2019

1 commit

  • Replace the CPU device PM QoS used for the management of min and max
    frequency constraints in cpufreq (and its users) with per-policy
    frequency QoS to avoid problems with cpufreq policies covering
    more than one CPU.

    Namely, a cpufreq driver is registered with the subsys interface
    which calls cpufreq_add_dev() for each CPU, starting from CPU0, so
    currently the PM QoS notifiers are added to the first CPU in the
    policy (i.e. CPU0 in the majority of cases).

    In turn, when the cpufreq driver is unregistered, the subsys interface
    doing that calls cpufreq_remove_dev() for each CPU, starting from CPU0,
    and the PM QoS notifiers are only removed when cpufreq_remove_dev() is
    called for the last CPU in the policy, say CPUx, which as a rule is
    not CPU0 if the policy covers more than one CPU. Then, the PM QoS
    notifiers cannot be removed, because CPUx does not have them, and
    they are still there in the device PM QoS notifiers list of CPU0,
    which prevents new PM QoS notifiers from being registered for CPU0
    on the next attempt to register the cpufreq driver.

    The same issue occurs when the first CPU in the policy goes offline
    before unregistering the driver.

    After this change it does not matter which CPU is the policy CPU at
    the driver registration time and whether or not it is online all the
    time, because the frequency QoS is per policy and not per CPU.

    Fixes: 67d874c3b2c6 ("cpufreq: Register notifiers with the PM QoS framework")
    Reported-by: Dmitry Osipenko
    Tested-by: Dmitry Osipenko
    Reported-by: Sudeep Holla
    Tested-by: Sudeep Holla
    Diagnosed-by: Viresh Kumar
    Link: https://lore.kernel.org/linux-pm/5ad2624194baa2f53acc1f1e627eb7684c577a19.1562210705.git.viresh.kumar@linaro.org/T/#md2d89e95906b8c91c15f582146173dce2e86e99f
    Link: https://lore.kernel.org/linux-pm/20191017094612.6tbkwoq4harsjcqv@vireshk-i7/T/#m30d48cc23b9a80467fbaa16e30f90b3828a5a29b
    Signed-off-by: Rafael J. Wysocki
    Acked-by: Viresh Kumar

    Rafael J. Wysocki
     

30 Sep, 2019

1 commit

  • Pull thermal SoC updates from Eduardo Valentin:
    "This is a really small pull in the midst of a lot of pending patches.

    We are in the middle of restructuring how we are maintaining the
    thermal subsystem, as per discussion in our last LPC. For now, I am
    sending just some changes that were pending in my tree. Looking
    forward to get a more streamlined process in the next merge window"

    * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal:
    thermal: db8500: Rewrite to be a pure OF sensor
    thermal: db8500: Use dev helper variable
    thermal: db8500: Finalize device tree conversion
    thermal: thermal_mmio: remove some dead code

    Linus Torvalds
     

28 Sep, 2019

1 commit

  • Pull thermal management updates from Zhang Rui:

    - Add Amit Kucheria as thermal subsystem Reviewer (Amit Kucheria)

    - Fix a use after free bug when unregistering thermal zone devices (Ido
    Schimmel)

    - Fix thermal core framework to use put_device() when device_register()
    fails (Yue Hu)

    - Enable intel_pch_thermal and MMIO RAPL support for Intel Icelake
    platform (Srinivas Pandruvada)

    - Add clock operations in qorip thermal driver, for some platforms with
    clock control like i.MX8MQ (Anson Huang)

    - A couple of trivial fixes and cleanups for thermal core and different
    soc thermal drivers (Amit Kucheria, Christophe JAILLET, Chuhong Yuan,
    Fuqian Huang, Kelsey Skunberg, Nathan Huckleberry, Rishi Gupta,
    Srinivas Kandagatla)

    * 'for-5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux:
    MAINTAINERS: Add Amit Kucheria as reviewer for thermal
    thermal: Add some error messages
    thermal: Fix use-after-free when unregistering thermal zone device
    thermal/drivers/core: Use put_device() if device_register() fails
    thermal_hwmon: Sanitize thermal_zone type
    thermal: intel: Use dev_get_drvdata
    thermal: intel: int3403: replace printk(KERN_WARN...) with pr_warn(...)
    thermal: intel: int340x_thermal: Remove unnecessary acpi_has_method() uses
    thermal: int340x: processor_thermal: Add Ice Lake support
    drivers: thermal: qcom: tsens: Fix memory leak from qfprom read
    thermal: tegra: Fix a typo
    thermal: rcar_gen3_thermal: Replace devm_add_action() followed by failure action with devm_add_action_or_reset()
    thermal: armada: Fix -Wshift-negative-value
    dt-bindings: thermal: qoriq: Add optional clocks property
    thermal: qoriq: Use __maybe_unused instead of #if CONFIG_PM_SLEEP
    thermal: qoriq: Use devm_platform_ioremap_resource() instead of of_iomap()
    thermal: qoriq: Fix error path of calling qoriq_tmu_register_tmu_zone fail
    thermal: qoriq: Add clock operations
    drivers: thermal: processor_thermal_device: Export sysfs interface for TCC offset

    Linus Torvalds
     

25 Sep, 2019

3 commits

  • This patch rewrites the DB8500 thermal sensor to be a
    pure OF sensor, so that it can be used with thermal zones
    defined in the device tree.

    This driver was initially merged before we had generic
    thermal zone device tree bindings, and now it gets
    modernized to the way we do things these days.

    The old driver depended on a set of trigger points
    provided in the device tree or platform data to
    interpolate the current temperature between trigger
    points depending on whether the trend was rising or
    falling. This was bad because the trigger points should
    be used for defining temperature zone policies and
    bind to cooling devices.

    As the PRCMU (power reset control management unit) can
    only issue IRQs when we pass temperature trigger points
    upward or downward, we instead define a number of
    temperature points inside the driver ranging from
    15 to 100 degrees celsius. The effect is that when
    we register the device we quickly trigger 15, 20 ... up
    to the room temperature in succession and then we
    get continuous event IRQs also under normal operating
    conditions, and the temperature of the system is now
    reported more accurately (+/- 2.5 degrees celsius)
    while in the past the first trigger point was at 70
    degrees and the average temperature was simply reported
    as 35 degrees celsius (between 70 degrees and 0) until
    we passed 70 degrees which didn't accurately represent
    the temperature of the system.
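
    The accuracy figure above follows from reporting the midpoint between
    two neighbouring temperature points. A minimal sketch, assuming the
    points run from 15 to 100 degrees in steps of 5 (the step is inferred
    from "15, 20 ..." and the +/- 2.5 degree figure, not stated outright);
    the names are hypothetical and the ends of the range are not modelled:

```c
/* Assumed layout: 18 points, 15, 20, ..., 100 degrees Celsius. */
#define NUM_POINTS_SKETCH 18

/* Temperature point for an index in 0 .. NUM_POINTS_SKETCH - 1. */
int point_celsius(int idx)
{
	return 15 + 5 * idx;
}

/*
 * Reported temperature in millidegrees Celsius when the real
 * temperature is known to lie between point low_idx and the next
 * point. low_idx must be in 0 .. NUM_POINTS_SKETCH - 2.
 */
long interpolated_mcelsius(int low_idx)
{
	long low = point_celsius(low_idx) * 1000L;
	long high = point_celsius(low_idx + 1) * 1000L;

	return (low + high) / 2;
}
```

    Between the 25 and 30 degree points this reports 27500 millidegrees,
    which is at most 2.5 degrees away from any temperature in that
    interval.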

    As a result of dropping all the trigger points from the
    driver and reusing the core DT thermal zone management
    code we reduce the code footprint quite a bit.

    Cc: Vincent Guittot
    Suggested-by: Daniel Lezcano
    Signed-off-by: Linus Walleij
    Reviewed-by: Daniel Lezcano
    Signed-off-by: Eduardo Valentin

    Linus Walleij
     
  • The code gets easier to read like this.

    Cc: Vincent Guittot
    Reviewed-by: Daniel Lezcano
    Signed-off-by: Linus Walleij
    Signed-off-by: Eduardo Valentin

    Linus Walleij
     
  • At some point there was an attempt to convert the DB8500
    thermal sensor to device tree: a probe path was added
    and the device tree was augmented for the Snowball board.
    The switchover was never completed: instead the thermal
    devices came from the PRCMU MFD device and the probe
    on the Snowball was confused as another set of configuration
    appeared from the device tree.

    Move over to a device-tree only approach, as we fixed up
    the device trees.

    Cc: Vincent Guittot
    Acked-by: Lee Jones
    Reviewed-by: Daniel Lezcano
    Signed-off-by: Linus Walleij
    Signed-off-by: Eduardo Valentin

    Linus Walleij
     

24 Sep, 2019

3 commits

  • Zhang Rui
     
  • When registering a thermal zone device, we currently return -EINVAL in
    four cases. This makes it a little hard to debug the real cause of the
    failure.

    Print some error messages to make it easier for developers to figure out
    what happened.

    Signed-off-by: Amit Kucheria
    Signed-off-by: Zhang Rui

    Amit Kucheria
     
  • thermal_zone_device_unregister() cancels the delayed work that polls the
    thermal zone, but it does not wait for it to finish. This is racy with
    respect to the freeing of the thermal zone device, which can result in a
    use-after-free [1].

    Fix this by waiting for the delayed work to finish before freeing the
    thermal zone device. Note that thermal_zone_device_set_polling() is
    never invoked from an atomic context, so it is safe to call
    cancel_delayed_work_sync() that can block.

    [1]
    [ +0.002221] ==================================================================
    [ +0.000064] BUG: KASAN: use-after-free in __mutex_lock+0x1076/0x11c0
    [ +0.000016] Read of size 8 at addr ffff8881e48e0450 by task kworker/1:0/17

    [ +0.000023] CPU: 1 PID: 17 Comm: kworker/1:0 Not tainted 5.2.0-rc6-custom-02495-g8e73ca3be4af #1701
    [ +0.000010] Hardware name: Mellanox Technologies Ltd. MSN2100-CB2FO/SA001017, BIOS 5.6.5 06/07/2016
    [ +0.000016] Workqueue: events_freezable_power_ thermal_zone_device_check
    [ +0.000012] Call Trace:
    [ +0.000021] dump_stack+0xa9/0x10e
    [ +0.000020] print_address_description.cold.2+0x9/0x25e
    [ +0.000018] __kasan_report.cold.3+0x78/0x9d
    [ +0.000016] kasan_report+0xe/0x20
    [ +0.000016] __mutex_lock+0x1076/0x11c0
    [ +0.000014] step_wise_throttle+0x72/0x150
    [ +0.000018] handle_thermal_trip+0x167/0x760
    [ +0.000019] thermal_zone_device_update+0x19e/0x5f0
    [ +0.000019] process_one_work+0x969/0x16f0
    [ +0.000017] worker_thread+0x91/0xc40
    [ +0.000014] kthread+0x33d/0x400
    [ +0.000015] ret_from_fork+0x3a/0x50

    [ +0.000020] Allocated by task 1:
    [ +0.000015] save_stack+0x19/0x80
    [ +0.000015] __kasan_kmalloc.constprop.4+0xc1/0xd0
    [ +0.000014] kmem_cache_alloc_trace+0x152/0x320
    [ +0.000015] thermal_zone_device_register+0x1b4/0x13a0
    [ +0.000015] mlxsw_thermal_init+0xc92/0x23d0
    [ +0.000014] __mlxsw_core_bus_device_register+0x659/0x11b0
    [ +0.000013] mlxsw_core_bus_device_register+0x3d/0x90
    [ +0.000013] mlxsw_pci_probe+0x355/0x4b0
    [ +0.000014] local_pci_probe+0xc3/0x150
    [ +0.000013] pci_device_probe+0x280/0x410
    [ +0.000013] really_probe+0x26a/0xbb0
    [ +0.000013] driver_probe_device+0x208/0x2e0
    [ +0.000013] device_driver_attach+0xfe/0x140
    [ +0.000013] __driver_attach+0x110/0x310
    [ +0.000013] bus_for_each_dev+0x14b/0x1d0
    [ +0.000013] driver_register+0x1c0/0x400
    [ +0.000015] mlxsw_sp_module_init+0x5d/0xd3
    [ +0.000014] do_one_initcall+0x239/0x4dd
    [ +0.000013] kernel_init_freeable+0x42b/0x4e8
    [ +0.000012] kernel_init+0x11/0x18b
    [ +0.000013] ret_from_fork+0x3a/0x50

    [ +0.000015] Freed by task 581:
    [ +0.000013] save_stack+0x19/0x80
    [ +0.000014] __kasan_slab_free+0x125/0x170
    [ +0.000013] kfree+0xf3/0x310
    [ +0.000013] thermal_release+0xc7/0xf0
    [ +0.000014] device_release+0x77/0x200
    [ +0.000014] kobject_put+0x1a8/0x4c0
    [ +0.000014] device_unregister+0x38/0xc0
    [ +0.000014] thermal_zone_device_unregister+0x54e/0x6a0
    [ +0.000014] mlxsw_thermal_fini+0x184/0x35a
    [ +0.000014] mlxsw_core_bus_device_unregister+0x10a/0x640
    [ +0.000013] mlxsw_devlink_core_bus_device_reload+0x92/0x210
    [ +0.000015] devlink_nl_cmd_reload+0x113/0x1f0
    [ +0.000014] genl_family_rcv_msg+0x700/0xee0
    [ +0.000013] genl_rcv_msg+0xca/0x170
    [ +0.000013] netlink_rcv_skb+0x137/0x3a0
    [ +0.000012] genl_rcv+0x29/0x40
    [ +0.000013] netlink_unicast+0x49b/0x660
    [ +0.000013] netlink_sendmsg+0x755/0xc90
    [ +0.000013] __sys_sendto+0x3de/0x430
    [ +0.000013] __x64_sys_sendto+0xe2/0x1b0
    [ +0.000013] do_syscall_64+0xa4/0x4d0
    [ +0.000013] entry_SYSCALL_64_after_hwframe+0x49/0xbe

    [ +0.000017] The buggy address belongs to the object at ffff8881e48e0008
    which belongs to the cache kmalloc-2k of size 2048
    [ +0.000012] The buggy address is located 1096 bytes inside of
    2048-byte region [ffff8881e48e0008, ffff8881e48e0808)
    [ +0.000007] The buggy address belongs to the page:
    [ +0.000012] page:ffffea0007923800 refcount:1 mapcount:0 mapping:ffff88823680d0c0 index:0x0 compound_mapcount: 0
    [ +0.000020] flags: 0x200000000010200(slab|head)
    [ +0.000019] raw: 0200000000010200 ffffea0007682008 ffffea00076ab808 ffff88823680d0c0
    [ +0.000016] raw: 0000000000000000 00000000000d000d 00000001ffffffff 0000000000000000
    [ +0.000007] page dumped because: kasan: bad access detected

    [ +0.000012] Memory state around the buggy address:
    [ +0.000012] ffff8881e48e0300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
    [ +0.000012] ffff8881e48e0380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
    [ +0.000012] >ffff8881e48e0400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
    [ +0.000008] ^
    [ +0.000012] ffff8881e48e0480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
    [ +0.000012] ffff8881e48e0500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
    [ +0.000007] ==================================================================

    Fixes: b1569e99c795 ("ACPI: move thermal trip handling to generic thermal layer")
    Reported-by: Jiri Pirko
    Signed-off-by: Ido Schimmel
    Acked-by: Jiri Pirko
    Signed-off-by: Zhang Rui

    Ido Schimmel