Eric Lee / smarc-fsl-linux-kernel

18 Oct, 2018

3 commits

94402f236 bonding: avoid possible dead-lock ... Browse Code »

[ Upstream commit d4859d749aa7090ffb743d15648adb962a1baeae ]

Syzkaller reported this on a slightly older kernel but it's still
applicable to the current kernel -

======================================================
WARNING: possible circular locking dependency detected
4.18.0-next-20180823+ #46 Not tainted
------------------------------------------------------
syz-executor4/26841 is trying to acquire lock:
00000000dd41ef48 ((wq_completion)bond_dev->name){+.+.}, at: flush_workqueue+0x2db/0x1e10 kernel/workqueue.c:2652

but task is already holding lock:
00000000768ab431 (rtnl_mutex){+.+.}, at: rtnl_lock net/core/rtnetlink.c:77 [inline]
00000000768ab431 (rtnl_mutex){+.+.}, at: rtnetlink_rcv_msg+0x412/0xc30 net/core/rtnetlink.c:4708

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #2 (rtnl_mutex){+.+.}:
__mutex_lock_common kernel/locking/mutex.c:925 [inline]
__mutex_lock+0x171/0x1700 kernel/locking/mutex.c:1073
mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:1088
rtnl_lock+0x17/0x20 net/core/rtnetlink.c:77
bond_netdev_notify drivers/net/bonding/bond_main.c:1310 [inline]
bond_netdev_notify_work+0x44/0xd0 drivers/net/bonding/bond_main.c:1320
process_one_work+0xc73/0x1aa0 kernel/workqueue.c:2153
worker_thread+0x189/0x13c0 kernel/workqueue.c:2296
kthread+0x35a/0x420 kernel/kthread.c:246
ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:415

-> #1 ((work_completion)(&(&nnw->work)->work)){+.+.}:
process_one_work+0xc0b/0x1aa0 kernel/workqueue.c:2129
worker_thread+0x189/0x13c0 kernel/workqueue.c:2296
kthread+0x35a/0x420 kernel/kthread.c:246
ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:415

-> #0 ((wq_completion)bond_dev->name){+.+.}:
lock_acquire+0x1e4/0x4f0 kernel/locking/lockdep.c:3901
flush_workqueue+0x30a/0x1e10 kernel/workqueue.c:2655
drain_workqueue+0x2a9/0x640 kernel/workqueue.c:2820
destroy_workqueue+0xc6/0x9d0 kernel/workqueue.c:4155
__alloc_workqueue_key+0xef9/0x1190 kernel/workqueue.c:4138
bond_init+0x269/0x940 drivers/net/bonding/bond_main.c:4734
register_netdevice+0x337/0x1100 net/core/dev.c:8410
bond_newlink+0x49/0xa0 drivers/net/bonding/bond_netlink.c:453
rtnl_newlink+0xef4/0x1d50 net/core/rtnetlink.c:3099
rtnetlink_rcv_msg+0x46e/0xc30 net/core/rtnetlink.c:4711
netlink_rcv_skb+0x172/0x440 net/netlink/af_netlink.c:2454
rtnetlink_rcv+0x1c/0x20 net/core/rtnetlink.c:4729
netlink_unicast_kernel net/netlink/af_netlink.c:1317 [inline]
netlink_unicast+0x5a0/0x760 net/netlink/af_netlink.c:1343
netlink_sendmsg+0xa18/0xfc0 net/netlink/af_netlink.c:1908
sock_sendmsg_nosec net/socket.c:622 [inline]
sock_sendmsg+0xd5/0x120 net/socket.c:632
___sys_sendmsg+0x7fd/0x930 net/socket.c:2115
__sys_sendmsg+0x11d/0x290 net/socket.c:2153
__do_sys_sendmsg net/socket.c:2162 [inline]
__se_sys_sendmsg net/socket.c:2160 [inline]
__x64_sys_sendmsg+0x78/0xb0 net/socket.c:2160
do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe

other info that might help us debug this:

Chain exists of:
(wq_completion)bond_dev->name --> (work_completion)(&(&nnw->work)->work) --> rtnl_mutex

Possible unsafe locking scenario:

CPU0 CPU1
---- ----
lock(rtnl_mutex);
lock((work_completion)(&(&nnw->work)->work));
lock(rtnl_mutex);
lock((wq_completion)bond_dev->name);

*** DEADLOCK ***

1 lock held by syz-executor4/26841:

stack backtrace:
CPU: 1 PID: 26841 Comm: syz-executor4 Not tainted 4.18.0-next-20180823+ #46
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x1c9/0x2b4 lib/dump_stack.c:113
print_circular_bug.isra.34.cold.55+0x1bd/0x27d kernel/locking/lockdep.c:1222
check_prev_add kernel/locking/lockdep.c:1862 [inline]
check_prevs_add kernel/locking/lockdep.c:1975 [inline]
validate_chain kernel/locking/lockdep.c:2416 [inline]
__lock_acquire+0x3449/0x5020 kernel/locking/lockdep.c:3412
lock_acquire+0x1e4/0x4f0 kernel/locking/lockdep.c:3901
flush_workqueue+0x30a/0x1e10 kernel/workqueue.c:2655
drain_workqueue+0x2a9/0x640 kernel/workqueue.c:2820
destroy_workqueue+0xc6/0x9d0 kernel/workqueue.c:4155
__alloc_workqueue_key+0xef9/0x1190 kernel/workqueue.c:4138
bond_init+0x269/0x940 drivers/net/bonding/bond_main.c:4734
register_netdevice+0x337/0x1100 net/core/dev.c:8410
bond_newlink+0x49/0xa0 drivers/net/bonding/bond_netlink.c:453
rtnl_newlink+0xef4/0x1d50 net/core/rtnetlink.c:3099
rtnetlink_rcv_msg+0x46e/0xc30 net/core/rtnetlink.c:4711
netlink_rcv_skb+0x172/0x440 net/netlink/af_netlink.c:2454
rtnetlink_rcv+0x1c/0x20 net/core/rtnetlink.c:4729
netlink_unicast_kernel net/netlink/af_netlink.c:1317 [inline]
netlink_unicast+0x5a0/0x760 net/netlink/af_netlink.c:1343
netlink_sendmsg+0xa18/0xfc0 net/netlink/af_netlink.c:1908
sock_sendmsg_nosec net/socket.c:622 [inline]
sock_sendmsg+0xd5/0x120 net/socket.c:632
___sys_sendmsg+0x7fd/0x930 net/socket.c:2115
__sys_sendmsg+0x11d/0x290 net/socket.c:2153
__do_sys_sendmsg net/socket.c:2162 [inline]
__se_sys_sendmsg net/socket.c:2160 [inline]
__x64_sys_sendmsg+0x78/0xb0 net/socket.c:2160
do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x457089
Code: fd b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 3d 01 f0 ff ff 0f 83 cb b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007f2df20a5c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007f2df20a66d4 RCX: 0000000000457089
RDX: 0000000000000000 RSI: 0000000020000180 RDI: 0000000000000003
RBP: 0000000000930140 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
R13: 00000000004d40b8 R14: 00000000004c8ad8 R15: 0000000000000001

Signed-off-by: Mahesh Bandewar
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Mahesh Bandewar
2018-10-18 15:16:17 +0800
e73b51a99 bnxt_en: free hwrm resources, if driver probe fails. ... Browse Code »

[ Upstream commit a2bf74f4e1b82395dad2b08d2a911d9151db71c1 ]

When the driver probe fails, all the resources that were allocated prior
to the failure must be freed. However, hwrm dma response memory is not
getting freed.

This patch fixes the problem described above.

Fixes: c0c050c58d84 ("bnxt_en: New Broadcom ethernet driver.")
Signed-off-by: Venkat Duvvuru
Signed-off-by: Michael Chan
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Venkat Duvvuru
2018-10-18 15:16:17 +0800
67d1ee6c7 bnxt_en: Fix TX timeout during netpoll. ... Browse Code »

[ Upstream commit 73f21c653f930f438d53eed29b5e4c65c8a0f906 ]

The current netpoll implementation in the bnxt_en driver has problems
that may miss TX completion events. bnxt_poll_work() in effect is
only handling at most 1 TX packet before exiting. In addition,
there may be in flight TX completions that ->poll() may miss even
after we fix bnxt_poll_work() to handle all visible TX completions.
netpoll may not call ->poll() again and HW may not generate IRQ
because the driver does not ARM the IRQ when the budget (0 for netpoll)
is reached.

We fix it by handling all TX completions and to always ARM the IRQ
when we exit ->poll() with 0 budget.

Also, the logic to ACK the completion ring in case it is almost filled
with TX completions need to be adjusted to take care of the 0 budget
case, as discussed with Eric Dumazet

Reported-by: Song Liu
Reviewed-by: Song Liu
Tested-by: Song Liu
Signed-off-by: Michael Chan
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Michael Chan
2018-10-18 15:16:17 +0800

13 Oct, 2018

23 commits

c03f0ab15 ath10k: fix scan crash due to incorrect length calculation ... Browse Code »

commit c8291988806407e02a01b4b15b4504eafbcc04e0 upstream.

Length of WMI scan message was not calculated correctly. The allocated
buffer was smaller than what we expected. So WMI message corrupted
skb_info, which is at the end of skb->data. This fix takes TLV header
into account even if the element is zero-length.

Crash log:
[49.629986] Unhandled kernel unaligned access[#1]:
[49.634932] CPU: 0 PID: 1176 Comm: logd Not tainted 4.4.60 #180
[49.641040] task: 83051460 ti: 8329c000 task.ti: 8329c000
[49.646608] $ 0 : 00000000 00000001 80984a80 00000000
[49.652038] $ 4 : 45259e89 8046d484 8046df30 8024ba70
[49.657468] $ 8 : 00000000 804cc4c0 00000001 20306320
[49.662898] $12 : 33322037 000110f2 00000000 31203930
[49.668327] $16 : 82792b40 80984a80 00000001 804207fc
[49.673757] $20 : 00000000 0000012c 00000040 80470000
[49.679186] $24 : 00000000 8024af7c
[49.684617] $28 : 8329c000 8329db88 00000001 802c58d0
[49.690046] Hi : 00000000
[49.693022] Lo : 453c0000
[49.696013] epc : 800efae4 put_page+0x0/0x58
[49.700615] ra : 802c58d0 skb_release_data+0x148/0x1d4
[49.706184] Status: 1000fc03 KERNEL EXL IE
[49.710531] Cause : 00800010 (ExcCode 04)
[49.714669] BadVA : 45259e89
[49.717644] PrId : 00019374 (MIPS 24Kc)

Signed-off-by: Zhi Chen
Signed-off-by: Kalle Valo
Cc: Brian Norris
Signed-off-by: Greg Kroah-Hartman

Zhi Chen
2018-10-13 15:27:30 +0800
711b942ae virtio_balloon: fix increment of vb->num_pfns in fill_balloon() ... Browse Code »

commit d9e427f6ab8142d6868eb719e6a7851aafea56b6 upstream.

commit c7cdff0e8647 ("virtio_balloon: fix deadlock on OOM")
changed code to increment vb->num_pfns before call to
set_page_pfns(), which used to happen only after.

This patch fixes boot hang for me on ppc64le KVM guests.

Fixes: c7cdff0e8647 ("virtio_balloon: fix deadlock on OOM")
Cc: Michael S. Tsirkin
Cc: Tetsuo Handa
Cc: Michal Hocko
Cc: Wei Wang
Cc: stable@vger.kernel.org
Signed-off-by: Jan Stancek
Signed-off-by: Michael S. Tsirkin
Signed-off-by: Sudip Mukherjee
Signed-off-by: Greg Kroah-Hartman

Jan Stancek
2018-10-13 15:27:30 +0800
7f42eada5 virtio_balloon: fix deadlock on OOM ... Browse Code »

commit c7cdff0e864713a089d7cb3a2b1136ba9a54881a upstream.

fill_balloon doing memory allocations under balloon_lock
can cause a deadlock when leak_balloon is called from
virtballoon_oom_notify and tries to take same lock.

To fix, split page allocation and enqueue and do allocations outside the lock.

Here's a detailed analysis of the deadlock by Tetsuo Handa:

In leak_balloon(), mutex_lock(&vb->balloon_lock) is called in order to
serialize against fill_balloon(). But in fill_balloon(),
alloc_page(GFP_HIGHUSER[_MOVABLE] | __GFP_NOMEMALLOC | __GFP_NORETRY) is
called with vb->balloon_lock mutex held. Since GFP_HIGHUSER[_MOVABLE]
implies __GFP_DIRECT_RECLAIM | __GFP_IO | __GFP_FS, despite __GFP_NORETRY
is specified, this allocation attempt might indirectly depend on somebody
else's __GFP_DIRECT_RECLAIM memory allocation. And such indirect
__GFP_DIRECT_RECLAIM memory allocation might call leak_balloon() via
virtballoon_oom_notify() via blocking_notifier_call_chain() callback via
out_of_memory() when it reached __alloc_pages_may_oom() and held oom_lock
mutex. Since vb->balloon_lock mutex is already held by fill_balloon(), it
will cause OOM lockup.

Thread1 Thread2
fill_balloon()
takes a balloon_lock
balloon_page_enqueue()
alloc_page(GFP_HIGHUSER_MOVABLE)
direct reclaim (__GFP_FS context) takes a fs lock
waits for that fs lock alloc_page(GFP_NOFS)
__alloc_pages_may_oom()
takes the oom_lock
out_of_memory()
blocking_notifier_call_chain()
leak_balloon()
tries to take that balloon_lock and deadlocks

Reported-by: Tetsuo Handa
Cc: Michal Hocko
Cc: Wei Wang
Signed-off-by: Michael S. Tsirkin
Signed-off-by: Sudip Mukherjee
Signed-off-by: Greg Kroah-Hartman

Michael S. Tsirkin
2018-10-13 15:27:30 +0800
5656b7354 ucma: fix a use-after-free in ucma_resolve_ip() ... Browse Code »

commit 5fe23f262e0548ca7f19fb79f89059a60d087d22 upstream.

There is a race condition between ucma_close() and ucma_resolve_ip():

CPU0 CPU1
ucma_resolve_ip(): ucma_close():

ctx = ucma_get_ctx(file, cmd.id);

list_for_each_entry_safe(ctx, tmp, &file->ctx_list, list) {
mutex_lock(&mut);
idr_remove(&ctx_idr, ctx->id);
mutex_unlock(&mut);
...
mutex_lock(&mut);
if (!ctx->closing) {
mutex_unlock(&mut);
rdma_destroy_id(ctx->cm_id);
...
ucma_free_ctx(ctx);

ret = rdma_resolve_addr();
ucma_put_ctx(ctx);

Before idr_remove(), ucma_get_ctx() could still find the ctx
and after rdma_destroy_id(), rdma_resolve_addr() may still
access id_priv pointer. Also, ucma_put_ctx() may use ctx after
ucma_free_ctx() too.

ucma_close() should call ucma_put_ctx() too which tests the
refcnt and waits for the last one releasing it. The similar
pattern is already used by ucma_destroy_id().

Reported-and-tested-by: syzbot+da2591e115d57a9cbb8b@syzkaller.appspotmail.com
Reported-by: syzbot+cfe3c1e8ef634ba8964b@syzkaller.appspotmail.com
Cc: Jason Gunthorpe
Cc: Doug Ledford
Cc: Leon Romanovsky
Signed-off-by: Cong Wang
Reviewed-by: Leon Romanovsky
Signed-off-by: Doug Ledford
Signed-off-by: Greg Kroah-Hartman

Cong Wang
2018-10-13 15:27:29 +0800
75fc05a20 crypto: chelsio - Fix memory corruption in DMA Mapped buffers. ... Browse Code »

commit add92a817e60e308a419693413a38d9d1e663aff upstream.

Update PCI Id in "cpl_rx_phys_dsgl" header. In case pci_chan_id and
tx_chan_id are not derived from same queue, H/W can send request
completion indication before completing DMA Transfer.

Herbert, It would be good if fix can be merge to stable tree.
For 4.14 kernel, It requires some update to avoid mege conficts.

Cc:
Signed-off-by: Harsh Jain
Signed-off-by: Herbert Xu
Signed-off-by: Greg Kroah-Hartman

Harsh Jain
2018-10-13 15:27:28 +0800
0f6e2f4e0 nvme_fc: fix ctrl create failures racing with workq items ... Browse Code »

commit cf25809bec2c7df4b45df5b2196845d9a4a3c89b upstream.

If there are errors during initial controller create, the transport
will teardown the partially initialized controller struct and free
the ctlr memory. Trouble is - most of those errors can occur due
to asynchronous events happening such io timeouts and subsystem
connectivity failures. Those failures invoke async workq items to
reset the controller and attempt reconnect. Those may be in progress
as the main thread frees the ctrl memory, resulting in NULL ptr oops.

Prevent this from happening by having the main ctrl failure thread
changing state to DELETING followed by synchronously cancelling any
pending queued work item. The change of state will prevent the
scheduling of resets or reconnect events.

Signed-off-by: James Smart
Signed-off-by: Keith Busch
Signed-off-by: Jens Axboe
Signed-off-by: Amit Pundir
Signed-off-by: Greg Kroah-Hartman

James Smart
2018-10-13 15:27:28 +0800
1b2ad48a8 ath10k: fix kernel panic issue during pci probe ... Browse Code »

commit 50e79e25250bf928369996277e85b00536b380c7 upstream.

If device gone during chip reset, ar->normal_mode_fw.board is not
initialized, but ath10k_debug_print_hwfw_info() will try to access its
member, which will cause 'kernel NULL pointer' issue. This was found
using a faulty device (pci link went down sometimes) in a random
insmod/rmmod/other-op test.
To fix it, check ar->normal_mode_fw.board before accessing the member.

pci 0000:02:00.0: BAR 0: assigned [mem 0xf7400000-0xf75fffff 64bit]
ath10k_pci 0000:02:00.0: enabling device (0000 -> 0002)
ath10k_pci 0000:02:00.0: pci irq msi oper_irq_mode 2 irq_mode 0 reset_mode 0
ath10k_pci 0000:02:00.0: failed to read device register, device is gone
ath10k_pci 0000:02:00.0: failed to wait for target init: -5
ath10k_pci 0000:02:00.0: failed to warm reset: -5
ath10k_pci 0000:02:00.0: firmware crashed during chip reset
ath10k_pci 0000:02:00.0: firmware crashed! (uuid 5d018951-b8e1-404a-8fde-923078b4423a)
ath10k_pci 0000:02:00.0: (null) target 0x00000000 chip_id 0x00340aff sub 0000:0000
ath10k_pci 0000:02:00.0: kconfig debug 1 debugfs 1 tracing 1 dfs 1 testmode 1
ath10k_pci 0000:02:00.0: firmware ver api 0 features crc32 00000000
...
BUG: unable to handle kernel NULL pointer dereference at 00000004
...
Call Trace:
[] ath10k_print_driver_info+0x12/0x20 [ath10k_core]
[] ath10k_pci_fw_crashed_dump+0x6d/0x4d0 [ath10k_pci]
[] ? ath10k_pci_sleep.part.19+0x57/0xc0 [ath10k_pci]
[] ath10k_pci_hif_power_up+0x14e/0x1b0 [ath10k_pci]
[] ? do_page_fault+0xb/0x10
[] ath10k_core_register_work+0x24/0x840 [ath10k_core]
[] ? netlbl_unlhsh_remove+0x178/0x410
[] ? __do_page_fault+0x480/0x480
[] process_one_work+0x114/0x3e0
[] worker_thread+0x37/0x4a0
[] kthread+0xa4/0xc0
[] ? create_worker+0x180/0x180
[] ? kthread_park+0x50/0x50
[] ret_from_fork+0x1b/0x28
Code: 78 80 b8 50 09 00 00 00 75 5d 8d 75 94 c7 44 24 08 aa d7 52 fb c7 44 24 04 64 00 00 00
89 34 24 e8 82 52 e2 c5 8b 83 dc 08 00 00 50 04 8b 08 31 c0 e8 20 57 e3 c5 89 44 24 10 8b 83 58 09 00
EIP: []-
ath10k_debug_print_board_info+0x34/0xb0 [ath10k_core]
SS:ESP 0068:f4921d90
CR2: 0000000000000004

Signed-off-by: Yu Wang
Signed-off-by: Kalle Valo
[AmitP: Minor rebasing for 4.14.y and 4.9.y]
Signed-off-by: Amit Pundir
Signed-off-by: Greg Kroah-Hartman

Yu Wang
2018-10-13 15:27:27 +0800
8146256b7 ath10k: fix use-after-free in ath10k_wmi_cmd_send_nowait ... Browse Code »

commit 9ef0f58ed7b4a55da4a64641d538e0d9e46579ac upstream.

The skb may be freed in tx completion context before
trace_ath10k_wmi_cmd is called. This can be easily captured when
KASAN(Kernel Address Sanitizer) is enabled. The fix is to move
trace_ath10k_wmi_cmd before the send operation. As the ret has no
meaning in trace_ath10k_wmi_cmd then, so remove this parameter too.

Signed-off-by: Carl Huang
Tested-by: Brian Norris
Reviewed-by: Brian Norris
Signed-off-by: Kalle Valo
Signed-off-by: Amit Pundir
Signed-off-by: Greg Kroah-Hartman

Carl Huang
2018-10-13 15:27:27 +0800
79f87e09b of: unittest: Disable interrupt node tests for old world MAC systems ... Browse Code »

commit 8894891446c9380709451b99ab45c5c53adfd2fc upstream.

On systems with OF_IMAP_OLDWORLD_MAC set in of_irq_workarounds, the
devicetree interrupt parsing code is different, causing unit tests of
devicetree interrupt nodes to fail. Due to a bug in unittest code, which
tries to dereference an uninitialized pointer, this results in a crash.

OF: /testcase-data/phandle-tests/consumer-a: arguments longer than property
Unable to handle kernel paging request for data at address 0x00bc616e
Faulting instruction address: 0xc08e9468
Oops: Kernel access of bad area, sig: 11 [#1]
BE PREEMPT PowerMac
Modules linked in:
CPU: 0 PID: 1 Comm: swapper Not tainted 4.14.72-rc1-yocto-standard+ #1
task: cf8e0000 task.stack: cf8da000
NIP: c08e9468 LR: c08ea5bc CTR: c08ea5ac
REGS: cf8dbb50 TRAP: 0300 Not tainted (4.14.72-rc1-yocto-standard+)
MSR: 00001032 CR: 82004044 XER: 00000000
DAR: 00bc616e DSISR: 40000000
GPR00: c08ea5bc cf8dbc00 cf8e0000 c13ca517 c13ca517 c13ca8a0 00000066 00000002
GPR08: 00000063 00bc614e c0b05865 000affff 82004048 00000000 c00047f0 00000000
GPR16: c0a80000 c0a9cc34 c13ca517 c0ad1134 05ffffff 000affff c0b05860 c0abeef8
GPR24: cecec278 cecec278 c0a8c4d0 c0a885e0 c13ca8a0 05ffffff c13ca8a0 c13ca517

NIP [c08e9468] device_node_gen_full_name+0x30/0x15c
LR [c08ea5bc] device_node_string+0x190/0x3c8
Call Trace:
[cf8dbc00] [c007f670] trace_hardirqs_on_caller+0x118/0x1fc (unreliable)
[cf8dbc40] [c08ea5bc] device_node_string+0x190/0x3c8
[cf8dbcb0] [c08eb794] pointer+0x25c/0x4d0
[cf8dbd00] [c08ebcbc] vsnprintf+0x2b4/0x5ec
[cf8dbd60] [c08ec00c] vscnprintf+0x18/0x48
[cf8dbd70] [c008e268] vprintk_store+0x4c/0x22c
[cf8dbda0] [c008ecac] vprintk_emit+0x94/0x130
[cf8dbdd0] [c008ff54] printk+0x5c/0x6c
[cf8dbe10] [c0b8ddd4] of_unittest+0x2220/0x26f8
[cf8dbea0] [c0004434] do_one_initcall+0x4c/0x184
[cf8dbf00] [c0b4534c] kernel_init_freeable+0x13c/0x1d8
[cf8dbf30] [c0004814] kernel_init+0x24/0x118
[cf8dbf40] [c0013398] ret_from_kernel_thread+0x5c/0x64

The problem was observed when running a qemu test for the g3beige machine
with devicetree unittests enabled.

Disable interrupt node tests on affected systems to avoid both false
unittest failures and the crash.

With this patch in place, unittest on the affected system passes with
the following message.

dt-test ### end of unittest - 144 passed, 0 failed

Fixes: 53a42093d96ef ("of: Add device tree selftests")
Signed-off-by: Guenter Roeck
Reviewed-by: Frank Rowand
Signed-off-by: Rob Herring
Signed-off-by: Greg Kroah-Hartman

Guenter Roeck
2018-10-13 15:27:27 +0800
171f90d4a tty: Drop tty->count on tty_reopen() failure ... Browse Code »

commit fe32416790093b31364c08395727de17ec96ace1 upstream.

In case of tty_ldisc_reinit() failure, tty->count should be decremented
back, otherwise we will never release_tty().
Tetsuo reported that it fixes noisy warnings on tty release like:
pts pts4033: tty_release: tty->count(10529) != (#fd's(7) + #kopen's(0))

Fixes: commit 892d1fa7eaae ("tty: Destroy ldisc instance on hangup")

Cc: stable@vger.kernel.org # v4.6+
Cc: Greg Kroah-Hartman
Cc: Jiri Slaby
Reviewed-by: Jiri Slaby
Tested-by: Jiri Slaby
Tested-by: Mark Rutland
Tested-by: Tetsuo Handa
Signed-off-by: Dmitry Safonov
Signed-off-by: Greg Kroah-Hartman

Dmitry Safonov
2018-10-13 15:27:26 +0800
c92e73b11 usb: cdc_acm: Do not leak URB buffers ... Browse Code »

commit f2924d4b16ae138c2de6a0e73f526fb638330858 upstream.

When the ACM TTY port is disconnected, the URBs it uses must be killed, and
then the buffers must be freed. Unfortunately a previous refactor removed
the code freeing the buffers because it looked extremely similar to the
code killing the URBs.

As a result, there were many new leaks for each plug/unplug cycle of a
CDC-ACM device, that were detected by kmemleak.

Restore the missing code, and the memory leak is removed.

Fixes: ba8c931ded8d ("cdc-acm: refactor killing urbs")
Signed-off-by: Romain Izard
Acked-by: Oliver Neukum
Cc: stable
Signed-off-by: Greg Kroah-Hartman

Romain Izard
2018-10-13 15:27:26 +0800
821c42e7d USB: serial: simple: add Motorola Tetra MTP6550 id ... Browse Code »

commit f5fad711c06e652f90f581fc7c2caee327c33d31 upstream.

Add device-id for the Motorola Tetra radio MTP6550.

Bus 001 Device 004: ID 0cad:9012 Motorola CGISS
Device Descriptor:
bLength 18
bDescriptorType 1
bcdUSB 2.00
bDeviceClass 0 (Defined at Interface level)
bDeviceSubClass 0
bDeviceProtocol 0
bMaxPacketSize0 64
idVendor 0x0cad Motorola CGISS
idProduct 0x9012
bcdDevice 24.16
iManufacturer 1 Motorola Solutions, Inc.
iProduct 2 TETRA PEI interface
iSerial 0
bNumConfigurations 1
Configuration Descriptor:
bLength 9
bDescriptorType 2
wTotalLength 55
bNumInterfaces 2
bConfigurationValue 1
iConfiguration 3 Generic Serial config
bmAttributes 0x80
(Bus Powered)
MaxPower 500mA
Interface Descriptor:
bLength 9
bDescriptorType 4
bInterfaceNumber 0
bAlternateSetting 0
bNumEndpoints 2
bInterfaceClass 255 Vendor Specific Class
bInterfaceSubClass 0
bInterfaceProtocol 0
iInterface 0
Endpoint Descriptor:
bLength 7
bDescriptorType 5
bEndpointAddress 0x81 EP 1 IN
bmAttributes 2
Transfer Type Bulk
Synch Type None
Usage Type Data
wMaxPacketSize 0x0200 1x 512 bytes
bInterval 0
Endpoint Descriptor:
bLength 7
bDescriptorType 5
bEndpointAddress 0x01 EP 1 OUT
bmAttributes 2
Transfer Type Bulk
Synch Type None
Usage Type Data
wMaxPacketSize 0x0200 1x 512 bytes
Interface Descriptor:
bLength 9
bDescriptorType 4
bInterfaceNumber 1
bAlternateSetting 0
bNumEndpoints 2
bInterfaceClass 255 Vendor Specific Class
bInterfaceSubClass 0
bInterfaceProtocol 0
iInterface 0
Endpoint Descriptor:
bLength 7
bDescriptorType 5
bEndpointAddress 0x82 EP 2 IN
bmAttributes 2
Transfer Type Bulk
Synch Type None
Usage Type Data
wMaxPacketSize 0x0200 1x 512 bytes
bInterval 0
Endpoint Descriptor:
bLength 7
bDescriptorType 5
bEndpointAddress 0x02 EP 2 OUT
bmAttributes 2
Transfer Type Bulk
Synch Type None
Usage Type Data
wMaxPacketSize 0x0200 1x 512 bytes
bInterval 0
Device Qualifier (for other device speed):
bLength 10
bDescriptorType 6
bcdUSB 2.00
bDeviceClass 0 (Defined at Interface level)
bDeviceSubClass 0
bDeviceProtocol 0
bMaxPacketSize0 64
bNumConfigurations 1
Device Status: 0x0000
(Bus Powered)

Reported-by: Hans Hult
Cc: stable
Signed-off-by: Johan Hovold
Signed-off-by: Greg Kroah-Hartman

Johan Hovold
2018-10-13 15:27:26 +0800
35123e64a usb: xhci-mtk: resume USB3 roothub first ... Browse Code »

commit 555df5820e733cded7eb8d0bf78b2a791be51d75 upstream.

Give USB3 devices a better chance to enumerate at USB3 speeds if
they are connected to a suspended host.
Porting from "671ffdff5b13 xhci: resume USB 3 roothub first"

Cc:
Signed-off-by: Chunfeng Yun
Signed-off-by: Mathias Nyman
Signed-off-by: Greg Kroah-Hartman

Chunfeng Yun
2018-10-13 15:27:26 +0800
c096f5c4a xhci: Add missing CAS workaround for Intel Sunrise Point xHCI ... Browse Code »

commit ffe84e01bb1b38c7eb9c6b6da127a6c136d251df upstream.

The workaround for missing CAS bit is also needed for xHC on Intel
sunrisepoint PCH. For more details see:

Intel 100/c230 series PCH specification update Doc #332692-006 Errata #8

Cc:
Signed-off-by: Mathias Nyman
Signed-off-by: Greg Kroah-Hartman

Mathias Nyman
2018-10-13 15:27:26 +0800
ec6ae632e dm cache: fix resize crash if user doesn't reload cache table ... Browse Code »

commit 5d07384a666d4b2f781dc056bfeec2c27fbdf383 upstream.

A reload of the cache's DM table is needed during resize because
otherwise a crash will occur when attempting to access smq policy
entries associated with the portion of the cache that was recently
extended.

The reason is cache-size based data structures in the policy will not be
resized, the only way to safely extend the cache is to allow for a
proper cache policy initialization that occurs when the cache table is
loaded. For example the smq policy's space_init(), init_allocator(),
calc_hotspot_params() must be sized based on the extended cache size.

The fix for this is to disallow cache resizes of this pattern:
1) suspend "cache" target's device
2) resize the fast device used for the cache
3) resume "cache" target's device

Instead, the last step must be a full reload of the cache's DM table.

Fixes: 66a636356 ("dm cache: add stochastic-multi-queue (smq) policy")
Cc: stable@vger.kernel.org
Signed-off-by: Mike Snitzer
Signed-off-by: Greg Kroah-Hartman

Mike Snitzer
2018-10-13 15:27:25 +0800
f11a6abfd dm cache metadata: ignore hints array being too small during resize ... Browse Code »

commit 4561ffca88c546f96367f94b8f1e4715a9c62314 upstream.

Commit fd2fa9541 ("dm cache metadata: save in-core policy_hint_size to
on-disk superblock") enabled previously written policy hints to be
used after a cache is reactivated. But in doing so the cache
metadata's hint array was left exposed to out of bounds access because
on resize the metadata's on-disk hint array wasn't ever extended.

Fix this by ignoring that there are no on-disk hints associated with the
newly added cache blocks. An expanded on-disk hint array is later
rewritten upon the next clean shutdown of the cache.

Fixes: fd2fa9541 ("dm cache metadata: save in-core policy_hint_size to on-disk superblock")
Cc: stable@vger.kernel.org
Signed-off-by: Joe Thornber
Signed-off-by: Mike Snitzer
Signed-off-by: Greg Kroah-Hartman

Joe Thornber
2018-10-13 15:27:25 +0800
1364055c9 PM / core: Clear the direct_complete flag on errors ... Browse Code »

commit 69e445ab8b66a9f30519842ef18be555d3ee9b51 upstream.

If __device_suspend() runs asynchronously (in which case the device
passed to it is in dpm_suspended_list at that point) and it returns
early on an error or pending wakeup, and the power.direct_complete
flag has been set for the device already, the subsequent
device_resume() will be confused by that and it will call
pm_runtime_enable() incorrectly, as runtime PM has not been
disabled for the device by __device_suspend().

To avoid that, clear power.direct_complete if __device_suspend()
is not going to disable runtime PM for the device before returning.

Fixes: aae4518b3124 (PM / sleep: Mechanism to avoid resuming runtime-suspended devices unnecessarily)
Reported-by: Al Cooper
Tested-by: Al Cooper
Reviewed-by: Ulf Hansson
Cc: 3.16+ # 3.16+
Signed-off-by: Rafael J. Wysocki
Signed-off-by: Greg Kroah-Hartman

Rafael J. Wysocki
2018-10-13 15:27:25 +0800
8ebd65583 PCI: Reprogram bridge prefetch registers on resume ... Browse Code »

commit 083874549fdfefa629dfa752785e20427dde1511 upstream.

On 38+ Intel-based ASUS products, the NVIDIA GPU becomes unusable after S3
suspend/resume. The affected products include multiple generations of
NVIDIA GPUs and Intel SoCs. After resume, nouveau logs many errors such
as:

fifo: fault 00 [READ] at 0000005555555000 engine 00 [GR] client 04
[HUB/FE] reason 4a [] on channel -1 [007fa91000 unknown]
DRM: failed to idle channel 0 [DRM]

Similarly, the NVIDIA proprietary driver also fails after resume (black
screen, 100% CPU usage in Xorg process). We shipped a sample to NVIDIA for
diagnosis, and their response indicated that it's a problem with the parent
PCI bridge (on the Intel SoC), not the GPU.

Runtime suspend/resume works fine, only S3 suspend is affected.

We found a workaround: on resume, rewrite the Intel PCI bridge
'Prefetchable Base Upper 32 Bits' register (PCI_PREF_BASE_UPPER32). In the
cases that I checked, this register has value 0 and we just have to rewrite
that value.

Linux already saves and restores PCI config space during suspend/resume,
but this register was being skipped because upon resume, it already has
value 0 (the correct, pre-suspend value).

Intel appear to have previously acknowledged this behaviour and the
requirement to rewrite this register:
https://bugzilla.kernel.org/show_bug.cgi?id=116851#c23

Based on that, rewrite the prefetch register values even when that appears
unnecessary.

We have confirmed this solution on all the affected models we have in-hands
(X542UQ, UX533FD, X530UN, V272UN).

Additionally, this solves an issue where r8169 MSI-X interrupts were broken
after S3 suspend/resume on ASUS X441UAR. This issue was recently worked
around in commit 7bb05b85bc2d ("r8169: don't use MSI-X on RTL8106e"). It
also fixes the same issue on RTL6186evl/8111evl on an Aimfor-tech laptop
that we had not yet patched. I suspect it will also fix the issue that was
worked around in commit 7c53a722459c ("r8169: don't use MSI-X on
RTL8168g").

Thomas Martitz reports that this change also solves an issue where the AMD
Radeon Polaris 10 GPU on the HP Zbook 14u G5 is unresponsive after S3
suspend/resume.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=201069
Signed-off-by: Daniel Drake
Signed-off-by: Bjorn Helgaas
Reviewed-by: Rafael J. Wysocki
Reviewed-By: Peter Wu
CC: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman

Daniel Drake
2018-10-13 15:27:24 +0800
71a055625 drm/syncobj: Don't leak fences when WAIT_FOR_SUBMIT is set ... Browse Code »

commit 337fe9f5c1e7de1f391c6a692531379d2aa2ee11 upstream.

We attempt to get fences earlier in the hopes that everything will
already have fences and no callbacks will be needed. If we do succeed
in getting a fence, getting one a second time will result in a duplicate
ref with no unref. This is causing memory leaks in Vulkan applications
that create a lot of fences; playing for a few hours can, apparently,
bring down the system.

Cc: stable@vger.kernel.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107899
Reviewed-by: Chris Wilson
Signed-off-by: Jason Ekstrand
Signed-off-by: Sean Paul
Link: https://patchwork.freedesktop.org/patch/msgid/20180926071703.15257-1-jason.ekstrand@intel.com
Signed-off-by: Greg Kroah-Hartman

Jason Ekstrand
2018-10-13 15:27:23 +0800
0c0dd182a drm/amdgpu: Fix vce work queue was not cancelled when suspend ... Browse Code »

commit 61ea6f5831974ebd1a57baffd7cc30600a2e26fc upstream.

The vce cancel_delayed_work_sync never be called.
driver call the function in error path.

This caused the A+A suspend hang when runtime pm enebled.
As we will visit the smu in the idle queue. this will cause
smu hang because the dgpu has been suspend, and the dgpu also
will be waked up. As the smu has been hang, so the dgpu resume
will failed.

Reviewed-by: Christian König
Reviewed-by: Feifei Xu
Signed-off-by: Rex Zhu
Signed-off-by: Alex Deucher
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman

Rex Zhu
2018-10-13 15:27:23 +0800
309a1c5cf xen-netback: fix input validation in xenvif_set_hash_mapping() ... Browse Code »

commit 780e83c259fc33e8959fed8dfdad17e378d72b62 upstream.

Both len and off are frontend specified values, so we need to make
sure there's no overflow when adding the two for the bounds check. We
also want to avoid undefined behavior and hence use off to index into
->hash.mapping[] only after bounds checking. This at the same time
allows to take care of not applying off twice for the bounds checking
against vif->num_queues.

It is also insufficient to bounds check copy_op.len, as this is len
truncated to 16 bits.

This is XSA-270 / CVE-2018-15471.

Reported-by: Felix Wilhelm
Signed-off-by: Jan Beulich
Reviewed-by: Paul Durrant
Tested-by: Paul Durrant
Cc: stable@vger.kernel.org [4.7 onwards]
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Jan Beulich
2018-10-13 15:27:23 +0800
f66d89483 fbdev/omapfb: fix omapfb_memory_read infoleak ... Browse Code »

commit 1bafcbf59fed92af58955024452f45430d3898c5 upstream.

OMAPFB_MEMORY_READ ioctl reads pixels from the LCD's memory and copies
them to a userspace buffer. The code has two issues:

- The user provided width and height could be large enough to overflow
the calculations
- The copy_to_user() can copy uninitialized memory to the userspace,
which might contain sensitive kernel information.

Fix these by limiting the width & height parameters, and only copying
the amount of data that we actually received from the LCD.

Signed-off-by: Tomi Valkeinen
Reported-by: Jann Horn
Cc: stable@vger.kernel.org
Cc: security@kernel.org
Cc: Will Deacon
Cc: Jann Horn
Cc: Tony Lindgren
Signed-off-by: Bartlomiej Zolnierkiewicz
Signed-off-by: Greg Kroah-Hartman

Tomi Valkeinen
2018-10-13 15:27:23 +0800
887361696 clocksource/drivers/timer-atmel-pit: Properly handle error cases ... Browse Code »

commit 52bf4a900d9cede3eb14982d0f2c5e6db6d97cc3 upstream.

The smatch utility reports a possible leak:

smatch warnings:
drivers/clocksource/timer-atmel-pit.c:183 at91sam926x_pit_dt_init() warn: possible memory leak of 'data'

Ensure data is freed before exiting with an error.

Reported-by: Dan Carpenter
Signed-off-by: Alexandre Belloni
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Lezcano
Signed-off-by: Greg Kroah-Hartman

Alexandre Belloni
2018-10-13 15:27:23 +0800

10 Oct, 2018

14 commits

4e7ea6512 dm thin metadata: fix __udivdi3 undefined on 32-bit ... Browse Code »

commit 013ad043906b2befd4a9bfb06219ed9fedd92716 upstream.

sector_div() is only viable for use with sector_t.
dm_block_t is typedef'd to uint64_t -- so use div_u64() instead.

Fixes: 3ab918281 ("dm thin metadata: try to avoid ever aborting transactions")
Signed-off-by: Mike Snitzer
Cc: Sudip Mukherjee
Signed-off-by: Greg Kroah-Hartman

Mike Snitzer
2018-10-10 14:54:28 +0800
07f79b39d ixgbe: check return value of napi_complete_done() ... Browse Code »

commit 4233cfe6ec4683497d7318f55ce7617e97f2e610 upstream.

The NIC driver should only enable interrupts when napi_complete_done()
returns true. This patch adds the check for ixgbe.

Cc: stable@vger.kernel.org # 4.10+
Suggested-by: Eric Dumazet
Signed-off-by: Song Liu
Tested-by: Andrew Bowers
Signed-off-by: Jeff Kirsher
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Song Liu
2018-10-10 14:54:28 +0800
1d24e2609 Drivers: hv: vmbus: Use get/put_cpu() in vmbus_connect() ... Browse Code »

commit 41e270f6898e7502be9fd6920ee0a108ca259d36 upstream.

With CONFIG_DEBUG_PREEMPT=y, I always see this warning:
BUG: using smp_processor_id() in preemptible [00000000]

Fix the false warning by using get/put_cpu().

Here vmbus_connect() sends a message to the host and waits for the
host's response. The host will deliver the response message and an
interrupt on CPU msg->target_vcpu, and later the interrupt handler
will wake up vmbus_connect(). vmbus_connect() doesn't really have
to run on the same cpu as CPU msg->target_vcpu, so it's safe to
call put_cpu() just here.

Signed-off-by: Dexuan Cui
Cc: stable@vger.kernel.org
Cc: K. Y. Srinivasan
Cc: Haiyang Zhang
Cc: Stephen Hemminger
Signed-off-by: K. Y. Srinivasan
Signed-off-by: Greg Kroah-Hartman

Dexuan Cui
2018-10-10 14:54:28 +0800
119bf9470 gpiolib: Free the last requested descriptor ... Browse Code »

commit 19a4fbffc94e41abaa2a623a25ce2641d69eccf0 upstream.

The current code only frees N-1 gpios if an error occurs during
gpiod_set_transitory, gpiod_direction_output or gpiod_direction_input.
Leading to gpios that cannot be used by userspace nor other drivers.

Cc: Timur Tabi
Cc: stable@vger.kernel.org
Fixes: ab3dbcf78f60f46d ("gpioib: do not free unrequested descriptors)
Reported-by: Jan Lorenzen
Reported-by: Jim Paris
Signed-off-by: Ricardo Ribalda Delgado
Signed-off-by: Linus Walleij
Signed-off-by: Greg Kroah-Hartman

Ricardo Ribalda Delgado
2018-10-10 14:54:28 +0800
1df517a4c crypto: caam/jr - fix ablkcipher_edesc pointer arithmetic ... Browse Code »

commit 13cc6f48c7434ce46ba6dbc90003a136a263d75a upstream.

In some cases the zero-length hw_desc array at the end of
ablkcipher_edesc struct requires for 4B of tail padding.

Due to tail padding and the way pointers to S/G table and IV
are computed:
edesc->sec4_sg = (void *)edesc + sizeof(struct ablkcipher_edesc) +
desc_bytes;
iv = (u8 *)edesc->hw_desc + desc_bytes + sec4_sg_bytes;
first 4 bytes of IV are overwritten by S/G table.

Update computation of pointer to S/G table to rely on offset of hw_desc
member and not on sizeof() operator.

Cc: # 4.13+
Fixes: 115957bb3e59 ("crypto: caam - fix IV DMA mapping and updating")
Signed-off-by: Horia Geantă
Signed-off-by: Herbert Xu
Signed-off-by: Greg Kroah-Hartman

Horia Geantă
2018-10-10 14:54:28 +0800
3b1a8535b crypto: mxs-dcp - Fix wait logic on chan threads ... Browse Code »

commit d80771c08363ad7fbf0f56f5301e7ca65065c582 upstream.

When compiling with CONFIG_DEBUG_ATOMIC_SLEEP=y the mxs-dcp driver
prints warnings such as:

WARNING: CPU: 0 PID: 120 at kernel/sched/core.c:7736 __might_sleep+0x98/0x9c
do not call blocking ops when !TASK_RUNNING; state=1 set at [] dcp_chan_thread_sha+0x3c/0x2ec

The problem is that blocking ops will manipulate current->state
themselves so it is not allowed to call them between
set_current_state(TASK_INTERRUPTIBLE) and schedule().

Fix this by converting the per-chan mutex to a spinlock (it only
protects tiny list ops anyway) and rearranging the wait logic so that
callbacks are called current->state as TASK_RUNNING. Those callbacks
will indeed call blocking ops themselves so this is required.

Cc:
Signed-off-by: Leonard Crestez
Signed-off-by: Herbert Xu
Signed-off-by: Greg Kroah-Hartman

Leonard Crestez
2018-10-10 14:54:27 +0800
90ecb7003 crypto: qat - Fix KASAN stack-out-of-bounds bug in adf_probe() ... Browse Code »

commit ba439a6cbfa2936a6713f64cb499de7943673fe3 upstream.

The following KASAN warning was printed when booting a 64-bit kernel
on some systems with Intel CPUs:

[ 44.512826] ==================================================================
[ 44.520165] BUG: KASAN: stack-out-of-bounds in find_first_bit+0xb0/0xc0
[ 44.526786] Read of size 8 at addr ffff88041e02fc50 by task kworker/0:2/124

[ 44.535253] CPU: 0 PID: 124 Comm: kworker/0:2 Tainted: G X --------- --- 4.18.0-12.el8.x86_64+debug #1
[ 44.545858] Hardware name: Intel Corporation PURLEY/PURLEY, BIOS BKVDTRL1.86B.0005.D08.1712070559 12/07/2017
[ 44.555682] Workqueue: events work_for_cpu_fn
[ 44.560043] Call Trace:
[ 44.562502] dump_stack+0x9a/0xe9
[ 44.565832] print_address_description+0x65/0x22e
[ 44.570683] ? find_first_bit+0xb0/0xc0
[ 44.570689] kasan_report.cold.6+0x92/0x19f
[ 44.578726] find_first_bit+0xb0/0xc0
[ 44.578737] adf_probe+0x9eb/0x19a0 [qat_c62x]
[ 44.578751] ? adf_remove+0x110/0x110 [qat_c62x]
[ 44.591490] ? mark_held_locks+0xc8/0x140
[ 44.591498] ? _raw_spin_unlock+0x30/0x30
[ 44.591505] ? trace_hardirqs_on_caller+0x381/0x570
[ 44.604418] ? adf_remove+0x110/0x110 [qat_c62x]
[ 44.604427] local_pci_probe+0xd4/0x180
[ 44.604432] ? pci_device_shutdown+0x110/0x110
[ 44.617386] work_for_cpu_fn+0x51/0xa0
[ 44.621145] process_one_work+0x8fe/0x16e0
[ 44.625263] ? pwq_dec_nr_in_flight+0x2d0/0x2d0
[ 44.629799] ? lock_acquire+0x14c/0x400
[ 44.633645] ? move_linked_works+0x12e/0x2a0
[ 44.637928] worker_thread+0x536/0xb50
[ 44.641690] ? __kthread_parkme+0xb6/0x180
[ 44.645796] ? process_one_work+0x16e0/0x16e0
[ 44.650160] kthread+0x30c/0x3d0
[ 44.653400] ? kthread_create_worker_on_cpu+0xc0/0xc0
[ 44.658457] ret_from_fork+0x3a/0x50

[ 44.663557] The buggy address belongs to the page:
[ 44.668350] page:ffffea0010780bc0 count:0 mapcount:0 mapping:0000000000000000 index:0x0
[ 44.676356] flags: 0x17ffffc0000000()
[ 44.680023] raw: 0017ffffc0000000 ffffea0010780bc8 ffffea0010780bc8 0000000000000000
[ 44.687769] raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
[ 44.695510] page dumped because: kasan: bad access detected

[ 44.702578] Memory state around the buggy address:
[ 44.707372] ffff88041e02fb00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 44.714593] ffff88041e02fb80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 44.721810] >ffff88041e02fc00: 00 00 00 00 00 00 f1 f1 f1 f1 04 f2 f2 f2 f2 f2
[ 44.729028] ^
[ 44.734864] ffff88041e02fc80: f2 f2 00 00 00 00 f3 f3 f3 f3 00 00 00 00 00 00
[ 44.742082] ffff88041e02fd00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 44.749299] ==================================================================

Looking into the code:

int ret, bar_mask;
:
for_each_set_bit(bar_nr, (const unsigned long *)&bar_mask,

It is casting a 32-bit integer pointer to a 64-bit unsigned long
pointer. There are two problems here. First, the 32-bit pointer address
may not be 64-bit aligned. Secondly, it is accessing an extra 4 bytes.

This is fixed by changing the bar_mask type to unsigned long.

Cc:
Signed-off-by: Waiman Long
Signed-off-by: Herbert Xu
Signed-off-by: Greg Kroah-Hartman

Waiman Long
2018-10-10 14:54:27 +0800
06f93e40f iommu/amd: Clear memory encryption mask from physical address ... Browse Code »

commit b3e9b515b08e407ab3a026dc2e4d935c48d05f69 upstream.

Boris Ostrovsky reported a memory leak with device passthrough when SME
is active.

The VFIO driver uses iommu_iova_to_phys() to get the physical address for
an iova. This physical address is later passed into vfio_unmap_unpin() to
unpin the memory. The vfio_unmap_unpin() uses pfn_valid() before unpinning
the memory. The pfn_valid() check was failing because encryption mask was
part of the physical address returned. This resulted in the memory not
being unpinned and therefore leaked after the guest terminates.

The memory encryption mask must be cleared from the physical address in
iommu_iova_to_phys().

Fixes: 2543a786aa25 ("iommu/amd: Allow the AMD IOMMU to work with memory encryption")
Reported-by: Boris Ostrovsky
Cc: Tom Lendacky
Cc: Joerg Roedel
Cc:
Cc: Borislav Petkov
Cc: Paolo Bonzini
Cc: Radim Krčmář
Cc: kvm@vger.kernel.org
Cc: Boris Ostrovsky
Cc: # 4.14+
Signed-off-by: Brijesh Singh
Signed-off-by: Joerg Roedel
Signed-off-by: Greg Kroah-Hartman

Singh, Brijesh
2018-10-10 14:54:27 +0800
aa41fb959 xen: fix GCC warning and remove duplicate EVTCHN_ROW/EVTCHN_COL usage ... Browse Code »

[ Upstream commit 4dca864b59dd150a221730775e2f21f49779c135 ]

This patch removes duplicate macro useage in events_base.c.

It also fixes gcc warning:
variable ‘col’ set but not used [-Wunused-but-set-variable]

Signed-off-by: Joshua Abraham
Reviewed-by: Juergen Gross
Signed-off-by: Boris Ostrovsky
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman

Josh Abraham
2018-10-10 14:54:26 +0800
a502165da xen: avoid crash in disable_hotplug_cpu ... Browse Code »

[ Upstream commit 3366cdb6d350d95466ee430ac50f3c8415ca8f46 ]

The command 'xl vcpu-set 0 0', issued in dom0, will crash dom0:

BUG: unable to handle kernel NULL pointer dereference at 00000000000002d8
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP NOPTI
CPU: 7 PID: 65 Comm: xenwatch Not tainted 4.19.0-rc2-1.ga9462db-default #1 openSUSE Tumbleweed (unreleased)
Hardware name: Intel Corporation S5520UR/S5520UR, BIOS S5500.86B.01.00.0050.050620101605 05/06/2010
RIP: e030:device_offline+0x9/0xb0
Code: 77 24 00 e9 ce fe ff ff 48 8b 13 e9 68 ff ff ff 48 8b 13 e9 29 ff ff ff 48 8b 13 e9 ea fe ff ff 90 66 66 66 66 90 41 54 55 53 87 d8 02 00 00 01 0f 85 88 00 00 00 48 c7 c2 20 09 60 81 31 f6
RSP: e02b:ffffc90040f27e80 EFLAGS: 00010203
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: ffff8801f3800000 RSI: ffffc90040f27e70 RDI: 0000000000000000
RBP: 0000000000000000 R08: ffffffff820e47b3 R09: 0000000000000000
R10: 0000000000007ff0 R11: 0000000000000000 R12: ffffffff822e6d30
R13: dead000000000200 R14: dead000000000100 R15: ffffffff8158b4e0
FS: 00007ffa595158c0(0000) GS:ffff8801f39c0000(0000) knlGS:0000000000000000
CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000000002d8 CR3: 00000001d9602000 CR4: 0000000000002660
Call Trace:
handle_vcpu_hotplug_event+0xb5/0xc0
xenwatch_thread+0x80/0x140
? wait_woken+0x80/0x80
kthread+0x112/0x130
? kthread_create_worker_on_cpu+0x40/0x40
ret_from_fork+0x3a/0x50

This happens because handle_vcpu_hotplug_event is called twice. In the
first iteration cpu_present is still true, in the second iteration
cpu_present is false which causes get_cpu_device to return NULL.
In case of cpu#0, cpu_online is apparently always true.

Fix this crash by checking if the cpu can be hotplugged, which is false
for a cpu that was just removed.

Also check if the cpu was actually offlined by device_remove, otherwise
leave the cpu_present state as it is.

Rearrange to code to do all work with device_hotplug_lock held.

Signed-off-by: Olaf Hering
Reviewed-by: Juergen Gross
Signed-off-by: Boris Ostrovsky
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman

Olaf Hering
2018-10-10 14:54:26 +0800
4e1494794 xen/manage: don't complain about an empty value in control/sysrq node ... Browse Code »

[ Upstream commit 87dffe86d406bee8782cac2db035acb9a28620a7 ]

When guest receives a sysrq request from the host it acknowledges it by
writing '\0' to control/sysrq xenstore node. This, however, make xenstore
watch fire again but xenbus_scanf() fails to parse empty value with "%c"
format string:

sysrq: SysRq : Emergency Sync
Emergency Sync complete
xen:manage: Error -34 reading sysrq code in control/sysrq

Ignore -ERANGE the same way we already ignore -ENOENT, empty value in
control/sysrq is totally legal.

Signed-off-by: Vitaly Kuznetsov
Reviewed-by: Wei Liu
Signed-off-by: Boris Ostrovsky
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman

Vitaly Kuznetsov
2018-10-10 14:54:26 +0800
7d60f98cd s390/qeth: don't dump past end of unknown HW header ... Browse Code »

[ Upstream commit 0ac1487c4b2de383b91ecad1be561b8f7a2c15f4 ]

For inbound data with an unsupported HW header format, only dump the
actual HW header. We have no idea how much payload follows it, and what
it contains. Worst case, we dump past the end of the Inbound Buffer and
access whatever is located next in memory.

Signed-off-by: Julian Wiedmann
Signed-off-by: David S. Miller
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman

Julian Wiedmann
2018-10-10 14:54:26 +0800
d5afd6b6e s390/qeth: use vzalloc for QUERY OAT buffer ... Browse Code »

[ Upstream commit aec45e857c5538664edb76a60dd452e3265f37d1 ]

qeth_query_oat_command() currently allocates the kernel buffer for
the SIOC_QETH_QUERY_OAT ioctl with kzalloc. So on systems with
fragmented memory, large allocations may fail (eg. the qethqoat tool by
default uses 132KB).

Solve this issue by using vzalloc, backing the allocation with
non-contiguous memory.

Signed-off-by: Wenjia Zhang
Reviewed-by: Julian Wiedmann
Signed-off-by: Julian Wiedmann
Signed-off-by: David S. Miller
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman

Wenjia Zhang
2018-10-10 14:54:26 +0800
ad2978981 r8169: Clear RTL_FLAG_TASK_*_PENDING when clearing RTL_FLAG_TASK_ENABLED ... Browse Code »

[ Upstream commit 6ad569019999300afd8e614d296fdc356550b77f ]

After system suspend, sometimes the r8169 doesn't work when ethernet
cable gets pluggued.

This issue happens because rtl_reset_work() doesn't get called from
rtl8169_runtime_resume(), after system suspend.

In rtl_task(), RTL_FLAG_TASK_* only gets cleared if this condition is
met:
if (!netif_running(dev) ||
!test_bit(RTL_FLAG_TASK_ENABLED, tp->wk.flags))
...

If RTL_FLAG_TASK_ENABLED was cleared during system suspend while
RTL_FLAG_TASK_RESET_PENDING was set, the next rtl_schedule_task() won't
schedule task as the flag is still there.

So in addition to clearing RTL_FLAG_TASK_ENABLED, also clears other
flags.

Cc: Heiner Kallweit
Signed-off-by: Kai-Heng Feng
Signed-off-by: David S. Miller
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman

Kai-Heng Feng
2018-10-10 14:54:26 +0800