Eric Lee / smarc-fsl-linux-kernel

05 Aug, 2020

25 commits

ea559138b mlxsw: core: Free EMAD transactions using kfree_rcu() ... Browse Code »

[ Upstream commit 3c8ce24b037648a5a15b85888b259a74b05ff97d ]

The lifetime of EMAD transactions (i.e., 'struct mlxsw_reg_trans') is
managed using RCU. They are freed using kfree_rcu() once the transaction
ends.

However, in case the transaction failed it is freed immediately after being
removed from the active transactions list. This is problematic because it is
still possible for a different CPU to dereference the transaction from an RCU
read-side critical section while traversing the active transaction list in
mlxsw_emad_rx_listener_func(). In which case, a use-after-free is triggered
[1].

Fix this by freeing the transaction after a grace period by calling
kfree_rcu().

[1]
BUG: KASAN: use-after-free in mlxsw_emad_rx_listener_func+0x969/0xac0 drivers/net/ethernet/mellanox/mlxsw/core.c:671
Read of size 8 at addr ffff88800b7964e8 by task syz-executor.2/2881

CPU: 0 PID: 2881 Comm: syz-executor.2 Not tainted 5.8.0-rc4+ #44
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
Call Trace:

__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0xf6/0x16e lib/dump_stack.c:118
print_address_description.constprop.0+0x1c/0x250 mm/kasan/report.c:383
__kasan_report mm/kasan/report.c:513 [inline]
kasan_report.cold+0x1f/0x37 mm/kasan/report.c:530
mlxsw_emad_rx_listener_func+0x969/0xac0 drivers/net/ethernet/mellanox/mlxsw/core.c:671
mlxsw_core_skb_receive+0x571/0x700 drivers/net/ethernet/mellanox/mlxsw/core.c:2061
mlxsw_pci_cqe_rdq_handle drivers/net/ethernet/mellanox/mlxsw/pci.c:595 [inline]
mlxsw_pci_cq_tasklet+0x12a6/0x2520 drivers/net/ethernet/mellanox/mlxsw/pci.c:651
tasklet_action_common.isra.0+0x13f/0x3e0 kernel/softirq.c:550
__do_softirq+0x223/0x964 kernel/softirq.c:292
asm_call_on_stack+0x12/0x20 arch/x86/entry/entry_64.S:711

__run_on_irqstack arch/x86/include/asm/irq_stack.h:22 [inline]
run_on_irqstack_cond arch/x86/include/asm/irq_stack.h:48 [inline]
do_softirq_own_stack+0x109/0x140 arch/x86/kernel/irq_64.c:77
invoke_softirq kernel/softirq.c:387 [inline]
__irq_exit_rcu kernel/softirq.c:417 [inline]
irq_exit_rcu+0x16f/0x1a0 kernel/softirq.c:429
sysvec_apic_timer_interrupt+0x4e/0xd0 arch/x86/kernel/apic/apic.c:1091
asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:587
RIP: 0010:arch_local_irq_restore arch/x86/include/asm/irqflags.h:85 [inline]
RIP: 0010:__raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:160 [inline]
RIP: 0010:_raw_spin_unlock_irqrestore+0x3b/0x40 kernel/locking/spinlock.c:191
Code: e8 2a c3 f4 fc 48 89 ef e8 12 96 f5 fc f6 c7 02 75 11 53 9d e8 d6 db 11 fd 65 ff 0d 1f 21 b3 56 5b 5d c3 e8 a7 d7 11 fd 53 9d ed 0f 1f 00 55 48 89 fd 65 ff 05 05 21 b3 56 ff 74 24 08 48 8d
RSP: 0018:ffff8880446ffd80 EFLAGS: 00000286
RAX: 0000000000000006 RBX: 0000000000000286 RCX: 0000000000000006
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffffa94ecea9
RBP: ffff888012934408 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000001 R11: fffffbfff57be301 R12: 1ffff110088dffc1
R13: ffff888037b817c0 R14: ffff88802442415a R15: ffff888024424000
__do_sys_perf_event_open+0x1b5d/0x2bd0 kernel/events/core.c:11874
do_syscall_64+0x56/0xa0 arch/x86/entry/common.c:384
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x473dbd
Code: Bad RIP value.
RSP: 002b:00007f21e5e9cc28 EFLAGS: 00000246 ORIG_RAX: 000000000000012a
RAX: ffffffffffffffda RBX: 000000000057bf00 RCX: 0000000000473dbd
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000020000040
RBP: 000000000057bf00 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000003 R11: 0000000000000246 R12: 000000000057bf0c
R13: 00007ffd0493503f R14: 00000000004d0f46 R15: 00007f21e5e9cd80

Allocated by task 871:
save_stack+0x1b/0x40 mm/kasan/common.c:48
set_track mm/kasan/common.c:56 [inline]
__kasan_kmalloc mm/kasan/common.c:494 [inline]
__kasan_kmalloc.constprop.0+0xc2/0xd0 mm/kasan/common.c:467
kmalloc include/linux/slab.h:555 [inline]
kzalloc include/linux/slab.h:669 [inline]
mlxsw_core_reg_access_emad+0x70/0x1410 drivers/net/ethernet/mellanox/mlxsw/core.c:1812
mlxsw_core_reg_access+0xeb/0x540 drivers/net/ethernet/mellanox/mlxsw/core.c:1991
mlxsw_sp_port_get_hw_xstats+0x335/0x7e0 drivers/net/ethernet/mellanox/mlxsw/spectrum.c:1130
update_stats_cache+0xf4/0x140 drivers/net/ethernet/mellanox/mlxsw/spectrum.c:1173
process_one_work+0xa3e/0x17a0 kernel/workqueue.c:2269
worker_thread+0x9e/0x1050 kernel/workqueue.c:2415
kthread+0x355/0x470 kernel/kthread.c:291
ret_from_fork+0x22/0x30 arch/x86/entry/entry_64.S:293

Freed by task 871:
save_stack+0x1b/0x40 mm/kasan/common.c:48
set_track mm/kasan/common.c:56 [inline]
kasan_set_free_info mm/kasan/common.c:316 [inline]
__kasan_slab_free+0x12c/0x170 mm/kasan/common.c:455
slab_free_hook mm/slub.c:1474 [inline]
slab_free_freelist_hook mm/slub.c:1507 [inline]
slab_free mm/slub.c:3072 [inline]
kfree+0xe6/0x320 mm/slub.c:4052
mlxsw_core_reg_access_emad+0xd45/0x1410 drivers/net/ethernet/mellanox/mlxsw/core.c:1819
mlxsw_core_reg_access+0xeb/0x540 drivers/net/ethernet/mellanox/mlxsw/core.c:1991
mlxsw_sp_port_get_hw_xstats+0x335/0x7e0 drivers/net/ethernet/mellanox/mlxsw/spectrum.c:1130
update_stats_cache+0xf4/0x140 drivers/net/ethernet/mellanox/mlxsw/spectrum.c:1173
process_one_work+0xa3e/0x17a0 kernel/workqueue.c:2269
worker_thread+0x9e/0x1050 kernel/workqueue.c:2415
kthread+0x355/0x470 kernel/kthread.c:291
ret_from_fork+0x22/0x30 arch/x86/entry/entry_64.S:293

The buggy address belongs to the object at ffff88800b796400
which belongs to the cache kmalloc-512 of size 512
The buggy address is located 232 bytes inside of
512-byte region [ffff88800b796400, ffff88800b796600)
The buggy address belongs to the page:
page:ffffea00002de500 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 head:ffffea00002de500 order:2 compound_mapcount:0 compound_pincount:0
flags: 0x100000000010200(slab|head)
raw: 0100000000010200 dead000000000100 dead000000000122 ffff88806c402500
raw: 0000000000000000 0000000000100010 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
ffff88800b796380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff88800b796400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff88800b796480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff88800b796500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff88800b796580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb

Fixes: caf7297e7ab5 ("mlxsw: core: Introduce support for asynchronous EMAD register access")
Signed-off-by: Ido Schimmel
Reviewed-by: Jiri Pirko
Signed-off-by: David S. Miller
Signed-off-by: Sasha Levin

Ido Schimmel
2020-08-05 15:59:48 +0800
57f498ced mlxsw: core: Increase scope of RCU read-side critical section ... Browse Code »

[ Upstream commit 7d8e8f3433dc8d1dc87c1aabe73a154978fb4c4d ]

The lifetime of the Rx listener item ('rxl_item') is managed using RCU,
but is dereferenced outside of RCU read-side critical section, which can
lead to a use-after-free.

Fix this by increasing the scope of the RCU read-side critical section.

Fixes: 93c1edb27f9e ("mlxsw: Introduce Mellanox switch driver core")
Signed-off-by: Ido Schimmel
Reviewed-by: Jiri Pirko
Signed-off-by: David S. Miller
Signed-off-by: Sasha Levin

Ido Schimmel
2020-08-05 15:59:47 +0800
0f424eda4 mlx4: disable device on shutdown ... Browse Code »

[ Upstream commit 3cab8c65525920f00d8f4997b3e9bb73aecb3a8e ]

It appears that not disabling a PCI device on .shutdown may lead to
a Hardware Error with particular (perhaps buggy) BIOS versions:

mlx4_en: eth0: Close port called
mlx4_en 0000:04:00.0: removed PHC
reboot: Restarting system
{1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 1
{1}[Hardware Error]: event severity: fatal
{1}[Hardware Error]: Error 0, type: fatal
{1}[Hardware Error]: section_type: PCIe error
{1}[Hardware Error]: port_type: 4, root port
{1}[Hardware Error]: version: 1.16
{1}[Hardware Error]: command: 0x4010, status: 0x0143
{1}[Hardware Error]: device_id: 0000:00:02.2
{1}[Hardware Error]: slot: 0
{1}[Hardware Error]: secondary_bus: 0x04
{1}[Hardware Error]: vendor_id: 0x8086, device_id: 0x2f06
{1}[Hardware Error]: class_code: 000604
{1}[Hardware Error]: bridge: secondary_status: 0x2000, control: 0x0003
{1}[Hardware Error]: aer_uncor_status: 0x00100000, aer_uncor_mask: 0x00000000
{1}[Hardware Error]: aer_uncor_severity: 0x00062030
{1}[Hardware Error]: TLP Header: 40000018 040000ff 791f4080 00000000
[hw error repeats]
Kernel panic - not syncing: Fatal hardware error!
CPU: 0 PID: 2189 Comm: reboot Kdump: loaded Not tainted 5.6.x-blabla #1
Hardware name: HP ProLiant DL380 Gen9/ProLiant DL380 Gen9, BIOS P89 05/05/2017

Fix the mlx4 driver.

This is a very similar problem to what had been fixed in:
commit 0d98ba8d70b0 ("scsi: hpsa: disable device during shutdown")
to address https://bugzilla.kernel.org/show_bug.cgi?id=199779.

Fixes: 2ba5fbd62b25 ("net/mlx4_core: Handle AER flow properly")
Reported-by: Jake Lawrence
Signed-off-by: Jakub Kicinski
Reviewed-by: Saeed Mahameed
Signed-off-by: David S. Miller
Signed-off-by: Sasha Levin

Jakub Kicinski
2020-08-05 15:59:47 +0800
b1d629d32 net: lan78xx: fix transfer-buffer memory leak ... Browse Code »

[ Upstream commit 63634aa679ba8b5e306ad0727120309ae6ba8a8e ]

The interrupt URB transfer-buffer was never freed on disconnect or after
probe errors.

Fixes: 55d7de9de6c3 ("Microchip's LAN7800 family USB 2/3 to 10/100/1000 Ethernet device driver")
Cc: Woojung.Huh@microchip.com
Signed-off-by: Johan Hovold
Signed-off-by: David S. Miller
Signed-off-by: Sasha Levin

Johan Hovold
2020-08-05 15:59:47 +0800
9db3040eb net: lan78xx: add missing endpoint sanity check ... Browse Code »

[ Upstream commit 8d8e95fd6d69d774013f51e5f2ee10c6e6d1fc14 ]

Add the missing endpoint sanity check to prevent a NULL-pointer
dereference should a malicious device lack the expected endpoints.

Note that the driver has a broken endpoint-lookup helper,
lan78xx_get_endpoints(), which can end up accepting interfaces in an
altsetting without endpoints as long as *some* altsetting has a bulk-in
and a bulk-out endpoint.

Fixes: 55d7de9de6c3 ("Microchip's LAN7800 family USB 2/3 to 10/100/1000 Ethernet device driver")
Cc: Woojung.Huh@microchip.com
Signed-off-by: Johan Hovold
Signed-off-by: David S. Miller
Signed-off-by: Sasha Levin

Johan Hovold
2020-08-05 15:59:47 +0800
32ec4441c net/mlx5e: Fix kernel crash when setting vf VLANID on a VF dev ... Browse Code »

[ Upstream commit 350a63249d270b1f5bd05c7e2a24cd8de0f9db20 ]

After the cited commit, function 'mlx5_eswitch_set_vport_vlan' started
to acquire esw->state_lock.
However, esw is not defined for VF devices, hence attempting to set vf
VLANID on a VF dev will cause a kernel panic.

Fix it by moving up the (redundant) esw validation from function
'__mlx5_eswitch_set_vport_vlan' since the rest of the callers now have
and use a valid esw.

For example with vf device eth4:
# ip link set dev eth4 vf 0 vlan 0

Trace of the panic:
[ 411.409842] BUG: unable to handle page fault for address: 00000000000011b8
[ 411.449745] #PF: supervisor read access in kernel mode
[ 411.452348] #PF: error_code(0x0000) - not-present page
[ 411.454938] PGD 80000004189c9067 P4D 80000004189c9067 PUD 41899a067 PMD 0
[ 411.458382] Oops: 0000 [#1] SMP PTI
[ 411.460268] CPU: 4 PID: 5711 Comm: ip Not tainted 5.8.0-rc4_for_upstream_min_debug_2020_07_08_22_04 #1
[ 411.462447] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
[ 411.464158] RIP: 0010:__mutex_lock+0x4e/0x940
[ 411.464928] Code: fd 41 54 49 89 f4 41 52 53 89 d3 48 83 ec 70 44 8b 1d ee 03 b0 01 65 48 8b 04 25 28 00 00 00 48 89 45 c8 31 c0 45 85 db 75 0a 3b 7f 60 0f 85 7e 05 00 00 49 8d 45 68 41 56 41 b8 01 00 00 00
[ 411.467678] RSP: 0018:ffff88841fcd74b0 EFLAGS: 00010246
[ 411.468562] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[ 411.469715] RDX: 0000000000000000 RSI: 0000000000000002 RDI: 0000000000001158
[ 411.470812] RBP: ffff88841fcd7550 R08: ffffffffa00fa1ce R09: 0000000000000000
[ 411.471835] R10: ffff88841fcd7570 R11: 0000000000000000 R12: 0000000000000002
[ 411.472862] R13: 0000000000001158 R14: ffffffffa00fa1ce R15: 0000000000000000
[ 411.474004] FS: 00007faee7ca6b80(0000) GS:ffff88846fc00000(0000) knlGS:0000000000000000
[ 411.475237] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 411.476129] CR2: 00000000000011b8 CR3: 000000041909c006 CR4: 0000000000360ea0
[ 411.477260] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 411.478340] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 411.479332] Call Trace:
[ 411.479760] ? __nla_validate_parse.part.6+0x57/0x8f0
[ 411.482825] ? mlx5_eswitch_set_vport_vlan+0x3e/0xa0 [mlx5_core]
[ 411.483804] mlx5_eswitch_set_vport_vlan+0x3e/0xa0 [mlx5_core]
[ 411.484733] mlx5e_set_vf_vlan+0x41/0x50 [mlx5_core]
[ 411.485545] do_setlink+0x613/0x1000
[ 411.486165] __rtnl_newlink+0x53d/0x8c0
[ 411.486791] ? mark_held_locks+0x49/0x70
[ 411.487429] ? __lock_acquire+0x8fe/0x1eb0
[ 411.488085] ? rcu_read_lock_sched_held+0x52/0x60
[ 411.488998] ? kmem_cache_alloc_trace+0x16d/0x2d0
[ 411.489759] rtnl_newlink+0x47/0x70
[ 411.490357] rtnetlink_rcv_msg+0x24e/0x450
[ 411.490978] ? netlink_deliver_tap+0x92/0x3d0
[ 411.491631] ? validate_linkmsg+0x330/0x330
[ 411.492262] netlink_rcv_skb+0x47/0x110
[ 411.492852] netlink_unicast+0x1ac/0x270
[ 411.493551] netlink_sendmsg+0x336/0x450
[ 411.494209] sock_sendmsg+0x30/0x40
[ 411.494779] ____sys_sendmsg+0x1dd/0x1f0
[ 411.495378] ? copy_msghdr_from_user+0x5c/0x90
[ 411.496082] ___sys_sendmsg+0x87/0xd0
[ 411.496683] ? lock_acquire+0xb9/0x3a0
[ 411.497322] ? lru_cache_add+0x5/0x170
[ 411.497944] ? find_held_lock+0x2d/0x90
[ 411.498568] ? handle_mm_fault+0xe46/0x18c0
[ 411.499205] ? __sys_sendmsg+0x51/0x90
[ 411.499784] __sys_sendmsg+0x51/0x90
[ 411.500341] do_syscall_64+0x59/0x2e0
[ 411.500938] ? asm_exc_page_fault+0x8/0x30
[ 411.501609] ? rcu_read_lock_sched_held+0x52/0x60
[ 411.502350] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 411.503093] RIP: 0033:0x7faee73b85a7
[ 411.503654] Code: Bad RIP value.

Fixes: 0e18134f4f9f ("net/mlx5e: Eswitch, use state_lock to synchronize vlan change")
Signed-off-by: Alaa Hleihel
Reviewed-by: Roi Dayan
Reviewed-by: Vlad Buslov
Signed-off-by: Saeed Mahameed
Signed-off-by: Sasha Levin

Alaa Hleihel
2020-08-05 15:59:47 +0800
475cbcef4 net/mlx5e: Modify uplink state on interface up/down ... Browse Code »

[ Upstream commit 7d0314b11cdd92bca8b89684c06953bf114605fc ]

When setting the PF interface up/down, notify the firmware to update
uplink state via MODIFY_VPORT_STATE, when E-Switch is enabled.

This behavior will prevent sending traffic out on uplink port when PF is
down, such as sending traffic from a VF interface which is still up.
Currently when calling mlx5e_open/close(), the driver only sends PAOS
command to notify the firmware to set the physical port state to
up/down, however, it is not sufficient. When VF is in "auto" state, it
follows the uplink state, which was not updated on mlx5e_open/close()
before this patch.

When switchdev mode is enabled and uplink representor is first enabled,
set the uplink port state value back to its FW default "AUTO".

Fixes: 63bfd399de55 ("net/mlx5e: Send PAOS command on interface up/down")
Signed-off-by: Ron Diskin
Reviewed-by: Roi Dayan
Reviewed-by: Moshe Shemesh
Signed-off-by: Saeed Mahameed
Signed-off-by: Sasha Levin

Ron Diskin
2020-08-05 15:59:47 +0800
43608372b net/mlx5: Verify Hardware supports requested ptp function on a given pin ... Browse Code »

[ Upstream commit 071995c877a8646209d55ff8edddd2b054e7424c ]

Fix a bug where driver did not verify Hardware pin capabilities for
PTP functions.

Fixes: ee7f12205abc ("net/mlx5e: Implement 1PPS support")
Signed-off-by: Eran Ben Elisha
Reviewed-by: Ariel Levkovich
Signed-off-by: Saeed Mahameed
Signed-off-by: Sasha Levin

Eran Ben Elisha
2020-08-05 15:59:46 +0800
8901896f6 net/mlx5e: Fix error path of device attach ... Browse Code »

[ Upstream commit 5cd39b6e9a420329a9a408894be7ba8aa7dd755e ]

On failure to attach the netdev, fix the rollback by re-setting the
device's state back to MLX5E_STATE_DESTROYING.

Failing to attach doesn't stop statistics polling via .ndo_get_stats64.
In this case, although the device is not attached, it falsely continues
to query the firmware for counters. Setting the device's state back to
MLX5E_STATE_DESTROYING prevents the firmware counters query.

Fixes: 26e59d8077a3 ("net/mlx5e: Implement mlx5e interface attach/detach callbacks")
Signed-off-by: Aya Levin
Reviewed-by: Tariq Toukan
Signed-off-by: Saeed Mahameed
Signed-off-by: Sasha Levin

Aya Levin
2020-08-05 15:59:46 +0800
00bedd730 net/mlx5: E-switch, Destroy TSAR when fail to enable the mode ... Browse Code »

[ Upstream commit 2b8e9c7c3fd0e31091edb1c66cc06ffe4988ca21 ]

When either esw_legacy_enable() or esw_offloads_enable() fails,
code missed to destroy the created TSAR.

Hence, add the missing call to destroy the TSAR.

Fixes: 610090ebce92 ("net/mlx5: E-switch, Initialize TSAR Qos hardware block before its user vports")
Signed-off-by: Parav Pandit
Reviewed-by: Roi Dayan
Signed-off-by: Saeed Mahameed
Signed-off-by: Sasha Levin

Parav Pandit
2020-08-05 15:59:46 +0800
d70f9a3cc net: hns3: fix aRFS FD rules leftover after add a user FD rule ... Browse Code »

[ Upstream commit efe3fa45f770f1d66e2734ee7a3523c75694ff04 ]

When user had created a FD rule, all the aRFS rules should be clear up.
HNS3 process flow as below:
1.get spin lock of fd_ruls_list
2.clear up all aRFS rules
3.release lock
4.get spin lock of fd_ruls_list
5.creat a rules
6.release lock;

There is a short period of time between step 3 and step 4, which would
creatting some new aRFS FD rules if driver was receiving packet.
So refactor the fd_rule_lock to fix it.

Fixes: 441228875706 ("net: hns3: refine the flow director handle")
Signed-off-by: Guojia Liao
Signed-off-by: Huazhong Tan
Signed-off-by: David S. Miller
Signed-off-by: Sasha Levin

Guojia Liao
2020-08-05 15:59:46 +0800
475b8d619 net: hns3: fix a TX timeout issue ... Browse Code »

[ Upstream commit a7e90ee5965fafc53d36e8b3205f08c88d7bc11f ]

When the queue depth and queue parameters are modified, there is
a low probability that TX timeout occurs. The two operations cause
the link to be down or up when the watchdog is still working. All
queues are stopped when the link is down. After the carrier is on,
all queues are woken up. If the watchdog detects the link between
the carrier on and wakeup queues, a false TX timeout occurs.

So fix this issue by modifying the sequence of carrier on and queue
wakeup, which is symmetrical to the link down action.

Fixes: 76ad4f0ee747 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC")
Signed-off-by: Yonglong Liu
Signed-off-by: Huazhong Tan
Signed-off-by: David S. Miller
Signed-off-by: Sasha Levin

Yonglong Liu
2020-08-05 15:59:46 +0800
831c904a0 nvme-tcp: fix possible hang waiting for icresp response ... Browse Code »

[ Upstream commit adc99fd378398f4c58798a1c57889872967d56a6 ]

If the controller died exactly when we are receiving icresp
we hang because icresp may never return. Make sure to set a
high finite limit.

Fixes: 3f2304f8c6d6 ("nvme-tcp: add NVMe over TCP host driver")
Signed-off-by: Sagi Grimberg
Signed-off-by: Christoph Hellwig
Signed-off-by: Sasha Levin

Sagi Grimberg
2020-08-05 15:59:45 +0800
6a9428427 drm: hold gem reference until object is no longer accessed ... Browse Code »

commit 8490d6a7e0a0a6fab5c2d82d57a3937306660864 upstream.

A use-after-free in drm_gem_open_ioctl can happen if the
GEM object handle is closed between the idr lookup and
retrieving the size from said object since a local reference
is not being held at that point. Hold the local reference
while the object can still be accessed to fix this and
plug the potential security hole.

Signed-off-by: Steve Cohen
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter
Link: https://patchwork.freedesktop.org/patch/msgid/1595284250-31580-1-git-send-email-cohens@codeaurora.org
Signed-off-by: Greg Kroah-Hartman

Steve Cohen
2020-08-05 15:59:44 +0800
7eef3b463 drm/dbi: Fix SPI Type 1 (9-bit) transfer ... Browse Code »

commit 900ab59e2621053b009f707f80b2c19ce0af5dee upstream.

The function mipi_dbi_spi1_transfer() will transfer its payload as 9-bit
data, the 9th (MSB) bit being the data/command bit. In order to do that,
it unpacks the 8-bit values into 16-bit values, then sets the 9th bit if
the byte corresponds to data, clears it otherwise. The 7 MSB are
padding. The array of now 16-bit values is then passed to the SPI core
for transfer.

This function was broken since its introduction, as the length of the
SPI transfer was set to the payload size before its conversion, but the
payload doubled in size due to the 8-bit -> 16-bit conversion.

Fixes: 02dd95fe3169 ("drm/tinydrm: Add MIPI DBI support")
Cc: # 5.4+
Signed-off-by: Paul Cercueil
Reviewed-by: Sam Ravnborg
Reviewed-by: Noralf Trønnes
Signed-off-by: Sam Ravnborg
Link: https://patchwork.freedesktop.org/patch/msgid/20200703141341.1266263-1-paul@crapouillou.net
Signed-off-by: Greg Kroah-Hartman

Paul Cercueil
2020-08-05 15:59:44 +0800
8ea180f1c drm/amdgpu: Prevent kernel-infoleak in amdgpu_info_ioctl() ... Browse Code »

commit 543e8669ed9bfb30545fd52bc0e047ca4df7fb31 upstream.

Compiler leaves a 4-byte hole near the end of `dev_info`, causing
amdgpu_info_ioctl() to copy uninitialized kernel stack memory to userspace
when `size` is greater than 356.

In 2015 we tried to fix this issue by doing `= {};` on `dev_info`, which
unfortunately does not initialize that 4-byte hole. Fix it by using
memset() instead.

Cc: stable@vger.kernel.org
Fixes: c193fa91b918 ("drm/amdgpu: information leak in amdgpu_info_ioctl()")
Fixes: d38ceaf99ed0 ("drm/amdgpu: add core driver (v4)")
Suggested-by: Dan Carpenter
Reviewed-by: Christian König
Signed-off-by: Peilin Ye
Signed-off-by: Alex Deucher
Signed-off-by: Greg Kroah-Hartman

Peilin Ye
2020-08-05 15:59:43 +0800
f1b4bdde2 drm/amd/display: Clear dm_state for fast updates ... Browse Code »

commit fde9f39ac7f1ffd799a96ffa1e06b2051f0898f1 upstream.

This patch fixes a race condition that causes a use-after-free during
amdgpu_dm_atomic_commit_tail. This can occur when 2 non-blocking commits
are requested and the second one finishes before the first. Essentially,
this bug occurs when the following sequence of events happens:

1. Non-blocking commit #1 is requested w/ a new dm_state #1 and is
deferred to the workqueue.

2. Non-blocking commit #2 is requested w/ a new dm_state #2 and is
deferred to the workqueue.

3. Commit #2 starts before commit #1, dm_state #1 is used in the
commit_tail and commit #2 completes, freeing dm_state #1.

4. Commit #1 starts after commit #2 completes, uses the freed dm_state
1 and dereferences a freelist pointer while setting the context.

Since this bug has only been spotted with fast commits, this patch fixes
the bug by clearing the dm_state instead of using the old dc_state for
fast updates. In addition, since dm_state is only used for its dc_state
and amdgpu_dm_atomic_commit_tail will retain the dc_state if none is found,
removing the dm_state should not have any consequences in fast updates.

This use-after-free bug has existed for a while now, but only caused a
noticeable issue starting from 5.7-rc1 due to 3202fa62f ("slub: relocate
freelist pointer to middle of object") moving the freelist pointer from
dm_state->base (which was unused) to dm_state->context (which is
dereferenced).

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=207383
Fixes: bd200d190f45 ("drm/amd/display: Don't replace the dc_state for fast updates")
Reported-by: Duncan
Signed-off-by: Mazin Rezk
Reviewed-by: Nicholas Kazlauskas
Signed-off-by: Alex Deucher
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman

Mazin Rezk
2020-08-05 15:59:43 +0800
22d3202e5 Revert "drm/amdgpu: Fix NULL dereference in dpm sysfs handlers" ... Browse Code »

commit 87004abfbc27261edd15716515d89ab42198b405 upstream.

This regressed some working configurations so revert it. Will
fix this properly for 5.9 and backport then.

This reverts commit 38e0c89a19fd13f28d2b4721035160a3e66e270b.

Signed-off-by: Alex Deucher
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman

Alex Deucher
2020-08-05 15:59:43 +0800
cea6633d5 virtio_balloon: fix up endian-ness for free cmd id ... Browse Code »

commit 168c358af2f8c5a37f8b5f877ba2cc93995606ee upstream.

free cmd id is read using virtio endian, spec says all fields
in balloon are LE. Fix it up.

Fixes: 86a559787e6f ("virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT")
Cc: stable@vger.kernel.org
Signed-off-by: Michael S. Tsirkin
Acked-by: Jason Wang
Reviewed-by: Wei Wang
Acked-by: David Hildenbrand
Signed-off-by: Greg Kroah-Hartman

Michael S. Tsirkin
2020-08-05 15:59:43 +0800
96f105943 vhost/scsi: fix up req type endian-ness ... Browse Code »

commit 295c1b9852d000580786375304a9800bd9634d15 upstream.

vhost/scsi doesn't handle type conversion correctly
for request type when using virtio 1.0 and up for BE,
or cross-endian platforms.

Fix it up using vhost_32_to_cpu.

Cc: stable@vger.kernel.org
Signed-off-by: Michael S. Tsirkin
Acked-by: Jason Wang
Reviewed-by: Stefan Hajnoczi
Signed-off-by: Greg Kroah-Hartman

Michael S. Tsirkin
2020-08-05 15:59:42 +0800
951117a20 IB/rdmavt: Fix RQ counting issues causing use of an invalid RWQE ... Browse Code »

commit 54a485e9ec084da1a4b32dcf7749c7d760ed8aa5 upstream.

The lookaside count is improperly initialized to the size of the
Receive Queue with the additional +1. In the traces below, the
RQ size is 384, so the count was set to 385.

The lookaside count is then rarely refreshed. Note the high and
incorrect count in the trace below:

rvt_get_rwqe: [hfi1_0] wqe ffffc900078e9008 wr_id 55c7206d75a0 qpn c
qpt 2 pid 3018 num_sge 1 head 1 tail 0, count 385
rvt_get_rwqe: (hfi1_rc_rcv+0x4eb/0x1480 [hfi1]
Cc: # 5.4.x
Reviewed-by: Kaike Wan
Signed-off-by: Mike Marciniszyn
Tested-by: Honggang Li
Signed-off-by: Jason Gunthorpe
Signed-off-by: Greg Kroah-Hartman

Mike Marciniszyn
2020-08-05 15:59:42 +0800
140210554 PCI/ASPM: Disable ASPM on ASMedia ASM1083/1085 PCIe-to-PCI bridge ... Browse Code »

commit b361663c5a40c8bc758b7f7f2239f7a192180e7c upstream.

Recently ASPM handling was changed to allow ASPM on PCIe-to-PCI/PCI-X
bridges. Unfortunately the ASMedia ASM1083/1085 PCIe to PCI bridge device
doesn't seem to function properly with ASPM enabled. On an Asus PRIME
H270-PRO motherboard, it causes errors like these:

pcieport 0000:00:1c.0: AER: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Transmitter ID)
pcieport 0000:00:1c.0: AER: device [8086:a292] error status/mask=00003000/00002000
pcieport 0000:00:1c.0: AER: [12] Timeout
pcieport 0000:00:1c.0: AER: Corrected error received: 0000:00:1c.0
pcieport 0000:00:1c.0: AER: can't find device of ID00e0

In addition to flooding the kernel log, this also causes the machine to
wake up immediately after suspend is initiated.

The device advertises ASPM L0s and L1 support in the Link Capabilities
register, but the ASMedia web page for ASM1083 [1] claims "No PCIe ASPM
support".

Windows 10 (build 2004) enables L0s, but it also logs correctable PCIe
errors.

Add a quirk to disable ASPM for this device.

[1] https://www.asmedia.com.tw/eng/e_show_products.php?cate_index=169&item=114

[bhelgaas: commit log]
Fixes: 66ff14e59e8a ("PCI/ASPM: Allow ASPM on links to PCIe-to-PCI/PCI-X Bridges")
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=208667
Link: https://lore.kernel.org/r/20200722021803.17958-1-hancockrwd@gmail.com
Signed-off-by: Robert Hancock
Signed-off-by: Bjorn Helgaas
Signed-off-by: Greg Kroah-Hartman

Robert Hancock
2020-08-05 15:59:41 +0800
2ff65580d ath10k: enable transmit data ack RSSI for QCA9884 ... Browse Code »

commit cc78dc3b790619aa05f22a86a9152986bd73698c upstream.

For all data packets transmitted, host gets htt tx completion event. Some QCA9984
firmware releases support WMI_SERVICE_TX_DATA_ACK_RSSI, which gives data
ack rssi values to host through htt event of data tx completion. Data ack rssi
values are valid if A0 bit is set in HTT rx message. So enable the feature also
for QCA9884.

Tested HW: QCA9984
Tested FW: 10.4-3.9.0.2-00044

Signed-off-by: Abhishek Ambure
Signed-off-by: Balaji Pothunoori
[kvalo@codeaurora.org: improve commit log]
Signed-off-by: Kalle Valo
Signed-off-by: Sathishkumar Muruganandam
Signed-off-by: Greg Kroah-Hartman

Abhishek Ambure
2020-08-05 15:59:41 +0800
84da97713 media: rc: prevent memory leak in cx23888_ir_probe ... Browse Code »

[ Upstream commit a7b2df76b42bdd026e3106cf2ba97db41345a177 ]

In cx23888_ir_probe if kfifo_alloc fails the allocated memory for state
should be released.

Signed-off-by: Navid Emamdoost
Signed-off-by: Sean Young
Signed-off-by: Mauro Carvalho Chehab
Signed-off-by: Sasha Levin

Navid Emamdoost
2020-08-05 15:59:40 +0800
ecfa7fa19 crypto: ccp - Release all allocated memory if sha type is invalid ... Browse Code »

[ Upstream commit 128c66429247add5128c03dc1e144ca56f05a4e2 ]

Release all allocated memory if sha type is invalid:
In ccp_run_sha_cmd, if the type of sha is invalid, the allocated
hmac_buf should be released.

v2: fix the goto.

Signed-off-by: Navid Emamdoost
Acked-by: Gary R Hook
Signed-off-by: Herbert Xu
Signed-off-by: Sasha Levin

Navid Emamdoost
2020-08-05 15:59:40 +0800

01 Aug, 2020

4 commits

909dbf09c Revert "dpaa_eth: fix usage as DSA master, try 3" ... Browse Code »

This reverts commit 40a904b1c2e57b22dd002dfce73688871cb0bac8.

The patch is not wrong, but the Fixes: tag is. It should have been:

Fixes: 060ad66f9795 ("dpaa_eth: change DMA device")

which means that it's fixing a commit which was introduced in:

git tag --contains 060ad66f97954
v5.5

which then means it should have not been backported to linux-5.4.y,
where things _were_ working and now they're not.

Reported-by: Joakim Tjernlund
Signed-off-by: Vladimir Oltean
Signed-off-by: Greg Kroah-Hartman

Vladimir Oltean
2020-08-01 00:39:32 +0800
4918285a6 PM: wakeup: Show statistics for deleted wakeup sources again ... Browse Code »

commit e976eb4b91e906f20ec25b20c152d53c472fc3fd upstream.

After commit 00ee22c28915 (PM / wakeup: Use seq_open() to show wakeup
stats), print_wakeup_source_stats(m, &deleted_ws) is not called from
wakeup_sources_stats_seq_show() any more.

Because deleted_ws is one of the wakeup sources, it should be shown
too, so add it to the end of all other wakeup sources.

Signed-off-by: zhuguangqing
[ rjw: Subject & changelog ]
Signed-off-by: Rafael J. Wysocki
Signed-off-by: Florian Fainelli
Signed-off-by: Greg Kroah-Hartman

zhuguangqing
2020-08-01 00:39:32 +0800
59242fa1d regmap: debugfs: check count when read regmap file ... Browse Code »

commit 74edd08a4fbf51d65fd8f4c7d8289cd0f392bd91 upstream.

When executing the following command, we met kernel dump.
dmesg -c > /dev/null; cd /sys;
for i in `ls /sys/kernel/debug/regmap/* -d`; do
echo "Checking regmap in $i";
cat $i/registers;
done && grep -ri "0x02d0" *;

It is because the count value is too big, and kmalloc fails. So add an
upper bound check to allow max size `PAGE_SIZE << (MAX_ORDER - 1)`.

Signed-off-by: Peng Fan
Link: https://lore.kernel.org/r/1584064687-12964-1-git-send-email-peng.fan@nxp.com
Signed-off-by: Mark Brown
Signed-off-by: Greg Kroah-Hartman

Peng Fan
2020-08-01 00:39:32 +0800
fbcd85cd1 drivers/net/wan/x25_asy: Fix to make it work ... Browse Code »

[ Upstream commit 8fdcabeac39824fe67480fd9508d80161c541854 ]

This driver is not working because of problems of its receiving code.
This patch fixes it to make it work.

When the driver receives an LAPB frame, it should first pass the frame
to the LAPB module to process. After processing, the LAPB module passes
the data (the packet) back to the driver, the driver should then add a
one-byte pseudo header and pass the data to upper layers.

The changes to the "x25_asy_bump" function and the
"x25_asy_data_indication" function are to correctly implement this
procedure.

Also, the "x25_asy_unesc" function ignores any frame that is shorter
than 3 bytes. However the shortest frames are 2-byte long. So we need
to change it to allow 2-byte frames to pass.

Cc: Eric Dumazet
Cc: Martin Schiller
Signed-off-by: Xie He
Reviewed-by: Martin Schiller
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Xie He
2020-08-01 00:39:30 +0800

29 Jul, 2020

11 commits

c15d59b94 ath9k: Fix regression with Atheros 9271 ... Browse Code »

commit 92f53e2fda8bb9a559ad61d57bfb397ce67ed0ab upstream.

This fix allows ath9k_htc modules to connect to WLAN once again.

Fixes: 2bbcaaee1fcb ("ath9k: Fix general protection fault in ath9k_hif_usb_rx_cb")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=208251
Signed-off-by: Mark O'Donovan
Reported-by: Roman Mamedov
Tested-by: Viktor Jägersküpper
Signed-off-by: Kalle Valo
Link: https://lore.kernel.org/r/20200711043324.8079-1-shiftee@posteo.net
Signed-off-by: Greg Kroah-Hartman

Mark O'Donovan
2020-07-29 16:18:46 +0800
e6eb815be ath9k: Fix general protection fault in ath9k_hif_usb_rx_cb ... Browse Code »

commit 2bbcaaee1fcbd83272e29f31e2bb7e70d8c49e05 upstream.

In ath9k_hif_usb_rx_cb interface number is assumed to be 0.
usb_ifnum_to_if(urb->dev, 0)
But it isn't always true.

The case reported by syzbot:
https://lore.kernel.org/linux-usb/000000000000666c9c05a1c05d12@google.com
usb 2-1: new high-speed USB device number 2 using dummy_hcd
usb 2-1: config 1 has an invalid interface number: 2 but max is 0
usb 2-1: config 1 has no interface number 0
usb 2-1: New USB device found, idVendor=0cf3, idProduct=9271, bcdDevice=
1.08
usb 2-1: New USB device strings: Mfr=1, Product=2, SerialNumber=3
general protection fault, probably for non-canonical address
0xdffffc0000000015: 0000 [#1] SMP KASAN
KASAN: null-ptr-deref in range [0x00000000000000a8-0x00000000000000af]
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.6.0-rc5-syzkaller #0

Call Trace
__usb_hcd_giveback_urb+0x29a/0x550 drivers/usb/core/hcd.c:1650
usb_hcd_giveback_urb+0x368/0x420 drivers/usb/core/hcd.c:1716
dummy_timer+0x1258/0x32ae drivers/usb/gadget/udc/dummy_hcd.c:1966
call_timer_fn+0x195/0x6f0 kernel/time/timer.c:1404
expire_timers kernel/time/timer.c:1449 [inline]
__run_timers kernel/time/timer.c:1773 [inline]
__run_timers kernel/time/timer.c:1740 [inline]
run_timer_softirq+0x5f9/0x1500 kernel/time/timer.c:1786
__do_softirq+0x21e/0x950 kernel/softirq.c:292
invoke_softirq kernel/softirq.c:373 [inline]
irq_exit+0x178/0x1a0 kernel/softirq.c:413
exiting_irq arch/x86/include/asm/apic.h:546 [inline]
smp_apic_timer_interrupt+0x141/0x540 arch/x86/kernel/apic/apic.c:1146
apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:829

Reported-and-tested-by: syzbot+40d5d2e8a4680952f042@syzkaller.appspotmail.com
Signed-off-by: Qiujun Huang
Signed-off-by: Kalle Valo
Link: https://lore.kernel.org/r/20200404041838.10426-6-hqjagain@gmail.com
Cc: Viktor Jägersküpper
Signed-off-by: Greg Kroah-Hartman

Qiujun Huang
2020-07-29 16:18:46 +0800
6d4448ca5 dm integrity: fix integrity recalculation that is improperly skipped ... Browse Code »

commit 5df96f2b9f58a5d2dc1f30fe7de75e197f2c25f2 upstream.

Commit adc0daad366b62ca1bce3e2958a40b0b71a8b8b3 ("dm: report suspended
device during destroy") broke integrity recalculation.

The problem is dm_suspended() returns true not only during suspend,
but also during resume. So this race condition could occur:
1. dm_integrity_resume calls queue_work(ic->recalc_wq, &ic->recalc_work)
2. integrity_recalc (&ic->recalc_work) preempts the current thread
3. integrity_recalc calls if (unlikely(dm_suspended(ic->ti))) goto unlock_ret;
4. integrity_recalc exits and no recalculating is done.

To fix this race condition, add a function dm_post_suspending that is
only true during the postsuspend phase and use it instead of
dm_suspended().

Signed-off-by: Mikulas Patocka
Fixes: adc0daad366b ("dm: report suspended device during destroy")
Cc: stable vger kernel org # v4.18+
Signed-off-by: Mike Snitzer
Signed-off-by: Greg Kroah-Hartman

Mikulas Patocka
2020-07-29 16:18:45 +0800
d1bab3cf7 drm/amd/powerplay: fix a crash when overclocking Vega M ... Browse Code »

commit 88bb16ad998a0395fe4b346b7d3f621aaa0a2324 upstream.

Avoid kernel crash when vddci_control is SMU7_VOLTAGE_CONTROL_NONE and
vddci_voltage_table is empty. It has been tested on Intel Hades Canyon
(i7-8809G).

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=208489
Fixes: ac7822b0026f ("drm/amd/powerplay: add smumgr support for VEGAM (v2)")
Reviewed-by: Evan Quan
Signed-off-by: Qiu Wenbo
Signed-off-by: Alex Deucher
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman

Qiu Wenbo
2020-07-29 16:18:44 +0800
33ab3f2dc drm/amdgpu: Fix NULL dereference in dpm sysfs handlers ... Browse Code »

commit 38e0c89a19fd13f28d2b4721035160a3e66e270b upstream.

NULL dereference occurs when string that is not ended with space or
newline is written to some dpm sysfs interface (for example pp_dpm_sclk).
This happens because strsep replaces the tmp with NULL if the delimiter
is not present in string, which is then dereferenced by tmp[0].

Reproduction example:
sudo sh -c 'echo -n 1 > /sys/class/drm/card0/device/pp_dpm_sclk'

Signed-off-by: Paweł Gronowski
Signed-off-by: Alex Deucher
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman

Paweł Gronowski
2020-07-29 16:18:44 +0800
c3de96686 mmc: sdhci-of-aspeed: Fix clock divider calculation ... Browse Code »

commit ebd4050c6144b38098d8eed34df461e5e3fa82a9 upstream.

When calculating the clock divider, start dividing at 2 instead of 1.
The divider is divided by two at the end of the calculation, so starting
at 1 may result in a divider of 0, which shouldn't happen.

Signed-off-by: Eddie James
Reviewed-by: Andrew Jeffery
Acked-by: Joel Stanley
Acked-by: Adrian Hunter
Link: https://lore.kernel.org/r/20200709195706.12741-3-eajames@linux.ibm.com
Cc: stable@vger.kernel.org # v5.4+
Signed-off-by: Ulf Hansson
Signed-off-by: Greg Kroah-Hartman

Eddie James
2020-07-29 16:18:44 +0800
23e8b741c vt: Reject zero-sized screen buffer size. ... Browse Code »

commit ce684552a266cb1c7cc2f7e623f38567adec6653 upstream.

syzbot is reporting general protection fault in do_con_write() [1] caused
by vc->vc_screenbuf == ZERO_SIZE_PTR caused by vc->vc_screenbuf_size == 0
caused by vc->vc_cols == vc->vc_rows == vc->vc_size_row == 0 caused by
fb_set_var() from ioctl(FBIOPUT_VSCREENINFO) on /dev/fb0 , for
gotoxy(vc, 0, 0) from reset_terminal() from vc_init() from vc_allocate()
from con_install() from tty_init_dev() from tty_open() on such console
causes vc->vc_pos == 0x10000000e due to
((unsigned long) ZERO_SIZE_PTR) + -1U * 0 + (-1U << 1).

I don't think that a console with 0 column or 0 row makes sense. And it
seems that vc_do_resize() does not intend to allow resizing a console to
0 column or 0 row due to

new_cols = (cols ? cols : vc->vc_cols);
new_rows = (lines ? lines : vc->vc_rows);

exception.

Theoretically, cols and rows can be any range as long as
0 < cols * rows * 2 vc_size_row = vc->vc_cols << 1;
vc->vc_screenbuf_size = vc->vc_rows * vc->vc_size_row;

in visual_init() and kzalloc(vc->vc_screenbuf_size) in vc_allocate().

Since we can detect cols == 0 or rows == 0 via screenbuf_size = 0 in
visual_init(), we can reject kzalloc(0). Then, vc_allocate() will return
an error, and con_write() will not be called on a console with 0 column
or 0 row.

We need to make sure that integer overflow in visual_init() won't happen.
Since vc_do_resize() restricts cols
Signed-off-by: Tetsuo Handa
Cc: stable
Link: https://lore.kernel.org/r/20200712111013.11881-1-penguin-kernel@I-love.SAKURA.ne.jp
Signed-off-by: Greg Kroah-Hartman

Tetsuo Handa
2020-07-29 16:18:43 +0800
028b478f2 fbdev: Detect integer underflow at "struct fbcon_ops"->clear_margins. ... Browse Code »

commit 033724d6864245a11f8e04c066002e6ad22b3fd0 upstream.

syzbot is reporting general protection fault in bitfill_aligned() [1]
caused by integer underflow in bit_clear_margins(). The cause of this
problem is when and how do_vc_resize() updates vc->vc_{cols,rows}.

If vc_do_resize() fails (e.g. kzalloc() fails) when var.xres or var.yres
is going to shrink, vc->vc_{cols,rows} will not be updated. This allows
bit_clear_margins() to see info->var.xres < (vc->vc_cols * cw) or
info->var.yres < (vc->vc_rows * ch). Unexpectedly large rw or bh will
try to overrun the __iomem region and causes general protection fault.

Also, vc_resize(vc, 0, 0) does not set vc->vc_{cols,rows} = 0 due to

new_cols = (cols ? cols : vc->vc_cols);
new_rows = (lines ? lines : vc->vc_rows);

exception. Since cols and lines are calculated as

cols = FBCON_SWAP(ops->rotate, info->var.xres, info->var.yres);
rows = FBCON_SWAP(ops->rotate, info->var.yres, info->var.xres);
cols /= vc->vc_font.width;
rows /= vc->vc_font.height;
vc_resize(vc, cols, rows);

in fbcon_modechanged(), var.xres < vc->vc_font.width makes cols = 0
and var.yres < vc->vc_font.height makes rows = 0. This means that

const int fd = open("/dev/fb0", O_ACCMODE);
struct fb_var_screeninfo var = { };
ioctl(fd, FBIOGET_VSCREENINFO, &var);
var.xres = var.yres = 1;
ioctl(fd, FBIOPUT_VSCREENINFO, &var);

easily reproduces integer underflow bug explained above.

Of course, callers of vc_resize() are not handling vc_do_resize() failure
is bad. But we can't avoid vc_resize(vc, 0, 0) which returns 0. Therefore,
as a band-aid workaround, this patch checks integer underflow in
"struct fbcon_ops"->clear_margins call, assuming that
vc->vc_cols * vc->vc_font.width and vc->vc_rows * vc->vc_font.heigh do not
cause integer overflow.

[1] https://syzkaller.appspot.com/bug?id=a565882df74fa76f10d3a6fec4be31098dbb37c6

Reported-and-tested-by: syzbot
Signed-off-by: Tetsuo Handa
Acked-by: Daniel Vetter
Cc: stable
Link: https://lore.kernel.org/r/20200715015102.3814-1-penguin-kernel@I-love.SAKURA.ne.jp
Signed-off-by: Greg Kroah-Hartman

Tetsuo Handa
2020-07-29 16:18:43 +0800
bf331efc8 /dev/mem: Add missing memory barriers for devmem_inode ... Browse Code »

commit b34e7e298d7a5ed76b3aa327c240c29f1ef6dd22 upstream.

WRITE_ONCE() isn't the correct way to publish a pointer to a data
structure, since it doesn't include a write memory barrier. Therefore
other tasks may see that the pointer has been set but not see that the
pointed-to memory has finished being initialized yet. Instead a
primitive with "release" semantics is needed.

Use smp_store_release() for this.

The use of READ_ONCE() on the read side is still potentially correct if
there's no control dependency, i.e. if all memory being "published" is
transitively reachable via the pointer itself. But this pairing is
somewhat confusing and error-prone. So just upgrade the read side to
smp_load_acquire() so that it clearly pairs with smp_store_release().

Cc: Arnd Bergmann
Cc: Ingo Molnar
Cc: Kees Cook
Cc: Matthew Wilcox
Cc: Russell King
Cc: Andrew Morton
Fixes: 3234ac664a87 ("/dev/mem: Revoke mappings when a driver claims the region")
Signed-off-by: Eric Biggers
Cc: stable
Acked-by: Dan Williams
Link: https://lore.kernel.org/r/20200716060553.24618-1-ebiggers@kernel.org
Signed-off-by: Greg Kroah-Hartman

Eric Biggers
2020-07-29 16:18:43 +0800
3c52751df serial: 8250_mtk: Fix high-speed baud rates clamping ... Browse Code »

commit 551e553f0d4ab623e2a6f424ab5834f9c7b5229c upstream.

Commit 7b668c064ec3 ("serial: 8250: Fix max baud limit in generic 8250
port") fixed limits of a baud rate setting for a generic 8250 port.
In other words since that commit the baud rate has been permitted to be
within [uartclk / 16 / UART_DIV_MAX; uartclk / 16], which is absolutely
normal for a standard 8250 UART port. But there are custom 8250 ports,
which provide extended baud rate limits. In particular the Mediatek 8250
port can work with baud rates up to "uartclk" speed.

Normally that and any other peculiarity is supposed to be handled in a
custom set_termios() callback implemented in the vendor-specific
8250-port glue-driver. Currently that is how it's done for the most of
the vendor-specific 8250 ports, but for some reason for Mediatek a
solution has been spread out to both the glue-driver and to the generic
8250-port code. Due to that a bug has been introduced, which permitted the
extended baud rate limit for all even for standard 8250-ports. The bug
has been fixed by the commit 7b668c064ec3 ("serial: 8250: Fix max baud
limit in generic 8250 port") by narrowing the baud rates limit back down to
the normal bounds. Unfortunately by doing so we also broke the
Mediatek-specific extended bauds feature.

A fix of the problem described above is twofold. First since we can't get
back the extended baud rate limits feature to the generic set_termios()
function and that method supports only a standard baud rates range, the
requested baud rate must be locally stored before calling it and then
restored back to the new termios structure after the generic set_termios()
finished its magic business. By doing so we still use the
serial8250_do_set_termios() method to set the LCR/MCR/FCR/etc. registers,
while the extended baud rate setting procedure will be performed later in
the custom Mediatek-specific set_termios() callback. Second since a true
baud rate is now fully calculated in the custom set_termios() method we
need to locally update the port timeout by calling the
uart_update_timeout() function. After the fixes described above are
implemented in the 8250_mtk.c driver, the Mediatek 8250-port should
get back to normally working with extended baud rates.

Link: https://lore.kernel.org/linux-serial/20200701211337.3027448-1-danielwinkler@google.com

Fixes: 7b668c064ec3 ("serial: 8250: Fix max baud limit in generic 8250 port")
Reported-by: Daniel Winkler
Signed-off-by: Serge Semin
Cc: stable
Tested-by: Claire Chang
Link: https://lore.kernel.org/r/20200714124113.20918-1-Sergey.Semin@baikalelectronics.ru
Signed-off-by: Greg Kroah-Hartman

Serge Semin
2020-07-29 16:18:43 +0800
af811869d serial: 8250: fix null-ptr-deref in serial8250_start_tx() ... Browse Code »

commit f4c23a140d80ef5e6d3d1f8f57007649014b60fa upstream.

I got null-ptr-deref in serial8250_start_tx():

[ 78.114630] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
[ 78.123778] Mem abort info:
[ 78.126560] ESR = 0x86000007
[ 78.129603] EC = 0x21: IABT (current EL), IL = 32 bits
[ 78.134891] SET = 0, FnV = 0
[ 78.137933] EA = 0, S1PTW = 0
[ 78.141064] user pgtable: 64k pages, 48-bit VAs, pgdp=00000027d41a8600
[ 78.147562] [0000000000000000] pgd=00000027893f0003, p4d=00000027893f0003, pud=00000027893f0003, pmd=00000027c9a20003, pte=0000000000000000
[ 78.160029] Internal error: Oops: 86000007 [#1] SMP
[ 78.164886] Modules linked in: sunrpc vfat fat aes_ce_blk crypto_simd cryptd aes_ce_cipher crct10dif_ce ghash_ce sha2_ce sha256_arm64 sha1_ce ses enclosure sg sbsa_gwdt ipmi_ssif spi_dw_mmio sch_fq_codel vhost_net tun vhost vhost_iotlb tap ip_tables ext4 mbcache jbd2 ahci hisi_sas_v3_hw libahci hisi_sas_main libsas hns3 scsi_transport_sas hclge libata megaraid_sas ipmi_si hnae3 ipmi_devintf ipmi_msghandler br_netfilter bridge stp llc nvme nvme_core xt_sctp sctp libcrc32c dm_mod nbd
[ 78.207383] CPU: 11 PID: 23258 Comm: null-ptr Not tainted 5.8.0-rc6+ #48
[ 78.214056] Hardware name: Huawei TaiShan 2280 V2/BC82AMDC, BIOS 2280-V2 CS V3.B210.01 03/12/2020
[ 78.222888] pstate: 80400089 (Nzcv daIf +PAN -UAO BTYPE=--)
[ 78.228435] pc : 0x0
[ 78.230618] lr : serial8250_start_tx+0x160/0x260
[ 78.235215] sp : ffff800062eefb80
[ 78.238517] x29: ffff800062eefb80 x28: 0000000000000fff
[ 78.243807] x27: ffff800062eefd80 x26: ffff202fd83b3000
[ 78.249098] x25: ffff800062eefd80 x24: ffff202fd83b3000
[ 78.254388] x23: ffff002fc5e50be8 x22: 0000000000000002
[ 78.259679] x21: 0000000000000001 x20: 0000000000000000
[ 78.264969] x19: ffffa688827eecc8 x18: 0000000000000000
[ 78.270259] x17: 0000000000000000 x16: 0000000000000000
[ 78.275550] x15: ffffa68881bc67a8 x14: 00000000000002e6
[ 78.280841] x13: ffffa68881bc67a8 x12: 000000000000c539
[ 78.286131] x11: d37a6f4de9bd37a7 x10: ffffa68881cccff0
[ 78.291421] x9 : ffffa68881bc6000 x8 : ffffa688819daa88
[ 78.296711] x7 : ffffa688822a0f20 x6 : ffffa688819e0000
[ 78.302002] x5 : ffff800062eef9d0 x4 : ffffa68881e707a8
[ 78.307292] x3 : 0000000000000000 x2 : 0000000000000002
[ 78.312582] x1 : 0000000000000001 x0 : ffffa688827eecc8
[ 78.317873] Call trace:
[ 78.320312] 0x0
[ 78.322147] __uart_start.isra.9+0x64/0x78
[ 78.326229] uart_start+0xb8/0x1c8
[ 78.329620] uart_flush_chars+0x24/0x30
[ 78.333442] n_tty_receive_buf_common+0x7b0/0xc30
[ 78.338128] n_tty_receive_buf+0x44/0x2c8
[ 78.342122] tty_ioctl+0x348/0x11f8
[ 78.345599] ksys_ioctl+0xd8/0xf8
[ 78.348903] __arm64_sys_ioctl+0x2c/0xc8
[ 78.352812] el0_svc_common.constprop.2+0x88/0x1b0
[ 78.357583] do_el0_svc+0x44/0xd0
[ 78.360887] el0_sync_handler+0x14c/0x1d0
[ 78.364880] el0_sync+0x140/0x180
[ 78.368185] Code: bad PC value

SERIAL_PORT_DFNS is not defined on each arch, if it's not defined,
serial8250_set_defaults() won't be called in serial8250_isa_init_ports(),
so the p->serial_in pointer won't be initialized, and it leads a null-ptr-deref.
Fix this problem by calling serial8250_set_defaults() after init uart port.

Signed-off-by: Yang Yingliang
Cc: stable
Link: https://lore.kernel.org/r/20200721143852.4058352-1-yangyingliang@huawei.com
Signed-off-by: Greg Kroah-Hartman

Yang Yingliang
2020-07-29 16:18:42 +0800