30 May, 2018

40 commits

  • [ Upstream commit f0849ac0b8e072073ec5fcc7fadd05a77434364e ]

    For a PTE-mapped THP, the compound THP has not been split into normal
    4K pages yet, so the whole THP is considered referenced if any one of
    its sub pages is referenced.

    When walking a PTE-mapped THP by pvmw (page_vma_mapped_walk), all
    relevant PTEs are checked to retrieve the referenced bit. But the
    current code just returns the result for the last PTE: if the last PTE
    has not been referenced, the referenced flag gets cleared.

    Just set referenced when ptep{pmdp}_clear_young_notify() returns true
    (see the sketch at the end of this entry).

    Link: http://lkml.kernel.org/r/1518212451-87134-1-git-send-email-yang.shi@linux.alibaba.com
    Signed-off-by: Yang Shi
    Reported-by: Gang Deng
    Suggested-by: Kirill A. Shutemov
    Reviewed-by: Andrew Morton
    Cc: "Kirill A. Shutemov"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Yang Shi
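
    A minimal userspace model of the aggregation pattern, with a
    hypothetical check_young() standing in for
    ptep{pmdp}_clear_young_notify(); this is not the kernel code, just an
    illustration that assigning the per-PTE result overwrites earlier
    hits, while only setting the flag on true accumulates them:

    #include <stdbool.h>
    #include <stdio.h>

    /* Stand-in for ptep_clear_young_notify(): only sub page 1 was touched. */
    static bool check_young(int subpage)
    {
            return subpage == 1;
    }

    int main(void)
    {
            bool buggy = false, fixed = false;

            for (int i = 0; i < 4; i++) {
                    buggy = check_young(i);         /* overwrites earlier hits */
                    if (check_young(i))
                            fixed = true;           /* accumulates any hit */
            }

            /* Prints buggy=0 fixed=1: the last sub page was not young, so
             * the buggy variant loses the reference seen on sub page 1. */
            printf("buggy=%d fixed=%d\n", buggy, fixed);
            return 0;
    }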
     
  • [ Upstream commit e92bb4dd9673945179b1fc738c9817dd91bfb629 ]

    When page_mapping() is called and the mapping is dereferenced in
    page_evictable() through shrink_active_list(), it is possible for the
    inode to be truncated and the embedded address_space to be freed at
    the same time. This may lead to the following race.

    CPU1                                              CPU2

    truncate(inode)                                   shrink_active_list()
      ...                                               page_evictable(page)
      truncate_inode_page(mapping, page);
        delete_from_page_cache(page)
          spin_lock_irqsave(&mapping->tree_lock, flags);
            __delete_from_page_cache(page, NULL)
              page_cache_tree_delete(..)
                ...                                       mapping = page_mapping(page);
                page->mapping = NULL;
              ...
          spin_unlock_irqrestore(&mapping->tree_lock, flags);
          page_cache_free_page(mapping, page)
            put_page(page)
              if (put_page_testzero(page)) -> false
    - inode now has no pages and can be freed including embedded
      address_space

                                                      mapping_unevictable(mapping)
                                                        test_bit(AS_UNEVICTABLE, &mapping->flags);
    - we've dereferenced mapping which is potentially already free.

    A similar race exists between swap cache freeing and page_evictable()
    too.

    The address_space in the inode and swap cache will be freed after an
    RCU grace period, so the races are fixed by enclosing the
    page_mapping() call and the address_space usage in
    rcu_read_lock()/rcu_read_unlock(). Some comments are added in the code
    to make it clear what is protected by the RCU read lock.

    Link: http://lkml.kernel.org/r/20180212081227.1940-1-ying.huang@intel.com
    Signed-off-by: "Huang, Ying"
    Reviewed-by: Jan Kara
    Reviewed-by: Andrew Morton
    Cc: Mel Gorman
    Cc: Minchan Kim
    Cc: "Huang, Ying"
    Cc: Johannes Weiner
    Cc: Michal Hocko
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Huang Ying
     
  • [ Upstream commit 77da2ba0648a4fd52e5ff97b8b2b8dd312aec4b0 ]

    This patch fixes a corner case for KSM. When two pages belong or
    belonged to the same transparent hugepage, and they should be merged,
    KSM fails to split the page, and therefore no merging happens.

    This bug can be reproduced by:
    * making sure ksm is running (disabling ksmtuned if necessary)
    * enabling transparent hugepages
    * allocating a THP-aligned, 1-THP-sized buffer,
      e.g. on amd64: posix_memalign(&p, 1<<21, 1<<21)
    * filling it with identical values, marking it mergeable with
      madvise(), and waiting for KSM to perform a few scans (see the
      sketch at the end of this entry)
    Co-authored-by: Gerald Schaefer
    Reviewed-by: Andrew Morton
    Cc: Andrea Arcangeli
    Cc: Minchan Kim
    Cc: Kirill A. Shutemov
    Cc: Hugh Dickins
    Cc: Christian Borntraeger
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Claudio Imbrenda
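
    A minimal userspace sketch of the reproducer above, assuming a 2 MiB
    THP size on amd64 and that KSM and transparent hugepages are already
    enabled via sysfs; the 60-second wait is arbitrary and error handling
    is minimal:

    #define _GNU_SOURCE
    #include <stdlib.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <unistd.h>

    #define THP_SIZE (1UL << 21)    /* 2 MiB THP on amd64 */

    int main(void)
    {
            void *p;

            /* THP-aligned, 1-THP-sized buffer */
            if (posix_memalign(&p, THP_SIZE, THP_SIZE))
                    return 1;

            /* fill it with identical values so the sub pages can be merged */
            memset(p, 42, THP_SIZE);

            /* make it mergeable and give KSM time to perform a few scans */
            if (madvise(p, THP_SIZE, MADV_MERGEABLE))
                    return 1;
            sleep(60);

            return 0;
    }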
     
  • [ Upstream commit 41f714672f93608751dbd2fa2291d476a8ff0150 ]

    The counter that tracks used TX descriptors pending completion
    needs to be zeroed as part of a device reset. This change fixes
    a bug causing transmit queues to be stopped unnecessarily and in
    some cases a transmit queue stall and timeout reset. If the counter
    is not reset, the remaining descriptors will not be "removed",
    effectively reducing the queue capacity. If the queue is over half
    full, this will cause the queue to stall once it is stopped.

    Signed-off-by: Thomas Falcon
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Thomas Falcon
     
  • [ Upstream commit 76327a35caabd1a932e83d6a42b967aa08584e5d ]

    The datasheet specifies a 3 µs pause after performing a software
    reset. The default implementation of genphy_soft_reset() does not
    provide this, so implement soft_reset with the needed pause.

    Signed-off-by: Esben Haabendal
    Reviewed-by: Andrew Lunn
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Esben Haabendal
     
  • [ Upstream commit 7854e499f33fd9c7e63288692ffb754d9b1d02fd ]

    The clang API calls used by perf have changed in recent releases and
    builds succeed with libclang-3.9 only. This introduces compatibility
    with libclang-4.0 and above.

    Without this patch, we will see the following compilation errors with
    libclang-4.0+:

    util/c++/clang.cpp: In function ‘clang::CompilerInvocation* perf::createCompilerInvocation(llvm::opt::ArgStringList, llvm::StringRef&, clang::DiagnosticsEngine&)’:
    util/c++/clang.cpp:62:33: error: ‘IK_C’ was not declared in this scope
    Opts.Inputs.emplace_back(Path, IK_C);
    ^~~~
    util/c++/clang.cpp: In function ‘std::unique_ptr perf::getModuleFromSource(llvm::opt::ArgStringList, llvm::StringRef, llvm::IntrusiveRefCntPtr)’:
    util/c++/clang.cpp:75:26: error: no matching function for call to ‘clang::CompilerInstance::setInvocation(clang::CompilerInvocation*)’
    Clang.setInvocation(&*CI);
    ^
    In file included from util/c++/clang.cpp:14:0:
    /usr/include/clang/Frontend/CompilerInstance.h:231:8: note: candidate: void clang::CompilerInstance::setInvocation(std::shared_ptr)
    void setInvocation(std::shared_ptr Value);
    ^~~~~~~~~~~~~

    Committer testing:

    Tested on Fedora 27 after installing the clang-devel and llvm-devel
    packages, versions:

    # rpm -qa | egrep llvm\|clang
    llvm-5.0.1-6.fc27.x86_64
    clang-libs-5.0.1-5.fc27.x86_64
    clang-5.0.1-5.fc27.x86_64
    clang-tools-extra-5.0.1-5.fc27.x86_64
    llvm-libs-5.0.1-6.fc27.x86_64
    llvm-devel-5.0.1-6.fc27.x86_64
    clang-devel-5.0.1-5.fc27.x86_64
    #

    Make sure you don't have some older version lying around in /usr/local,
    etc, then:

    $ make LIBCLANGLLVM=1 -C tools/perf install-bin

    And in the end perf will be linked against these libraries:

    # ldd ~/bin/perf | egrep -i llvm\|clang
    libclangAST.so.5 => /lib64/libclangAST.so.5 (0x00007f8bb2eb4000)
    libclangBasic.so.5 => /lib64/libclangBasic.so.5 (0x00007f8bb29e3000)
    libclangCodeGen.so.5 => /lib64/libclangCodeGen.so.5 (0x00007f8bb23f7000)
    libclangDriver.so.5 => /lib64/libclangDriver.so.5 (0x00007f8bb2060000)
    libclangFrontend.so.5 => /lib64/libclangFrontend.so.5 (0x00007f8bb1d06000)
    libclangLex.so.5 => /lib64/libclangLex.so.5 (0x00007f8bb1a3e000)
    libclangTooling.so.5 => /lib64/libclangTooling.so.5 (0x00007f8bb17d4000)
    libclangEdit.so.5 => /lib64/libclangEdit.so.5 (0x00007f8bb15c5000)
    libclangSema.so.5 => /lib64/libclangSema.so.5 (0x00007f8bb0cc9000)
    libclangAnalysis.so.5 => /lib64/libclangAnalysis.so.5 (0x00007f8bb0a23000)
    libclangParse.so.5 => /lib64/libclangParse.so.5 (0x00007f8bb0725000)
    libclangSerialization.so.5 => /lib64/libclangSerialization.so.5 (0x00007f8bb039a000)
    libLLVM-5.0.so => /lib64/libLLVM-5.0.so (0x00007f8bace98000)
    libclangASTMatchers.so.5 => /lib64/../lib64/libclangASTMatchers.so.5 (0x00007f8bab735000)
    libclangFormat.so.5 => /lib64/../lib64/libclangFormat.so.5 (0x00007f8bab4b2000)
    libclangRewrite.so.5 => /lib64/../lib64/libclangRewrite.so.5 (0x00007f8bab2a1000)
    libclangToolingCore.so.5 => /lib64/../lib64/libclangToolingCore.so.5 (0x00007f8bab08e000)
    #

    Signed-off-by: Sandipan Das
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Jiri Olsa
    Cc: Naveen N. Rao
    Fixes: 00b86691c77c ("perf clang: Add builtin clang support ant test case")
    Link: http://lkml.kernel.org/r/20180404180419.19056-2-sandipan@linux.vnet.ibm.com
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Sandipan Das
     
  • [ Upstream commit c2fb54a183cfe77c6fdc9d71e2d5299c1c302a6e ]

    For libclang, some distro packages provide static libraries (.a) while
    some provide shared libraries (.so). Currently, the perf code can only
    be linked with static libraries. This change makes the perf build
    possible for both cases.

    Signed-off-by: Sandipan Das
    Cc: Jiri Olsa
    Cc: Naveen N. Rao
    Fixes: d58ac0bf8d1e ("perf build: Add clang and llvm compile and linking support")
    Link: http://lkml.kernel.org/r/20180404180419.19056-1-sandipan@linux.vnet.ibm.com
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Sandipan Das
     
  • [ Upstream commit 709b973c844c0b4d115ac3a227a2e5a68722c912 ]

    The function get_user() can sleep while trying to fetch an instruction
    from user address space, which triggers the following warning from the
    scheduler:

    BUG: sleeping function called from invalid context

    Interrupts do get enabled again, but only a bit later, after
    get_user() is called. This change moves the enabling of these
    interrupts earlier, so that it covers the get_user() call. While at
    it, also check for kernel mode and crash, as this interrupt should not
    have been triggered from kernel context.

    Signed-off-by: Anshuman Khandual
    Signed-off-by: Michael Ellerman
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Anshuman Khandual
     
  • [ Upstream commit 8913315e9459b146e5888ab5138e10daa061b885 ]

    When multiple CPUs are related in one cpufreq policy, the first online
    CPU will be chosen by default to handle cpufreq operations. Let's take
    cpu0 and cpu1 as an example.

    When cpu0 goes offline, policy->cpu is shifted to cpu1, so cpu1's perf
    capabilities should be initialized. Otherwise the perf capabilities
    are all zeros and speed changes cannot take effect.

    This patch copies the perf capabilities of the first online CPU to the
    other shared CPUs when the policy shared type is
    CPUFREQ_SHARED_TYPE_ANY (see the sketch at the end of this entry).

    Acked-by: Viresh Kumar
    Signed-off-by: Shunyong Yang
    Signed-off-by: Rafael J. Wysocki
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Shunyong Yang
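
    A minimal model of the capability copy described above, with a
    hypothetical struct perf_caps standing in for the driver's per-CPU
    capability data (the real patch operates on the cppc_cpufreq driver's
    internal structures):

    #include <stdio.h>

    #define NR_CPUS 4

    /* Hypothetical stand-in for per-CPU performance capabilities. */
    struct perf_caps {
            unsigned int highest_perf;
            unsigned int lowest_perf;
    };

    static struct perf_caps caps[NR_CPUS];

    /* Copy the capabilities of the policy's first online CPU to the other
     * CPUs sharing the policy, so they are non-zero if that CPU goes
     * offline and another CPU takes over policy->cpu. */
    static void copy_shared_caps(int first_cpu, const int *shared, int n)
    {
            for (int i = 0; i < n; i++)
                    if (shared[i] != first_cpu)
                            caps[shared[i]] = caps[first_cpu];
    }

    int main(void)
    {
            int shared[] = { 0, 1 };

            caps[0].highest_perf = 300;
            caps[0].lowest_perf = 100;

            copy_shared_caps(0, shared, 2);
            printf("cpu1 highest_perf=%u\n", caps[1].highest_perf);
            return 0;
    }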
     
  • [ Upstream commit 8c81dd46ef3c416b3b95e3020fb90dbd44e6140b ]

    Forcing the log to disk after reading the AGF is wrong: we might be
    calling xfs_log_force() with XFS_LOG_SYNC while holding a metadata
    lock.

    This can cause a deadlock when racing a fstrim with a filesystem
    shutdown.

    The deadlock was identified due to a miscalculation bug in the
    device-mapper dm-thin target, which reports lack of space to its users
    earlier than the device really runs out of space, putting the
    device-mapper volume into an error state.

    The problem happened while filling the filesystem with a single file,
    triggering the bug in device-mapper, consequently causing an IO error
    and shutting down the filesystem.

    If such a file is removed and fstrim is executed before XFS finishes
    the shutdown process, the fstrim process ends up holding the buffer
    lock and going to sleep on the CIL wait queue.

    At this point, the shutdown process tries to wake up all the threads
    waiting on the CIL wait queue, but to do so it tries to take the same
    buffer lock already held by fstrim, locking up the filesystem.

    Signed-off-by: Carlos Maiolino
    Reviewed-by: Darrick J. Wong
    Signed-off-by: Darrick J. Wong
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Carlos Maiolino
     
  • [ Upstream commit 2d097c50212e137e7b53ffe3b37561153eeba87d ]

    We can't just use scsi_cd() to get the scsi_cd structure, we have
    to grab a live reference to the device. For both callbacks, we're
    not inside an open where we already hold a reference to the device.

    This fixes device removal/addition under concurrent device access,
    which otherwise could result in the below oops.

    NULL pointer dereference at 0000000000000010
    PGD 0 P4D 0
    Oops: 0000 [#1] PREEMPT SMP
    Modules linked in:
    sr 12:0:0:0: [sr2] scsi-1 drive
    scsi_debug crc_t10dif crct10dif_generic crct10dif_common nvme nvme_core sb_edac xl
    sr 12:0:0:0: Attached scsi CD-ROM sr2
    sr_mod cdrom btrfs xor zstd_decompress zstd_compress xxhash lzo_compress zlib_defc
    sr 12:0:0:0: Attached scsi generic sg7 type 5
    igb ahci libahci i2c_algo_bit libata dca [last unloaded: crc_t10dif]
    CPU: 43 PID: 4629 Comm: systemd-udevd Not tainted 4.16.0+ #650
    Hardware name: Dell Inc. PowerEdge T630/0NT78X, BIOS 2.3.4 11/09/2016
    RIP: 0010:sr_block_revalidate_disk+0x23/0x190 [sr_mod]
    RSP: 0018:ffff883ff357bb58 EFLAGS: 00010292
    RAX: ffffffffa00b07d0 RBX: ffff883ff3058000 RCX: ffff883ff357bb66
    RDX: 0000000000000003 RSI: 0000000000007530 RDI: ffff881fea631000
    RBP: 0000000000000000 R08: ffff881fe4d38400 R09: 0000000000000000
    R10: 0000000000000000 R11: 00000000000001b6 R12: 000000000800005d
    R13: 000000000800005d R14: ffff883ffd9b3790 R15: 0000000000000000
    FS: 00007f7dc8e6d8c0(0000) GS:ffff883fff340000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000000010 CR3: 0000003ffda98005 CR4: 00000000003606e0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Call Trace:
    ? __invalidate_device+0x48/0x60
    check_disk_change+0x4c/0x60
    sr_block_open+0x16/0xd0 [sr_mod]
    __blkdev_get+0xb9/0x450
    ? iget5_locked+0x1c0/0x1e0
    blkdev_get+0x11e/0x320
    ? bdget+0x11d/0x150
    ? _raw_spin_unlock+0xa/0x20
    ? bd_acquire+0xc0/0xc0
    do_dentry_open+0x1b0/0x320
    ? inode_permission+0x24/0xc0
    path_openat+0x4e6/0x1420
    ? cpumask_any_but+0x1f/0x40
    ? flush_tlb_mm_range+0xa0/0x120
    do_filp_open+0x8c/0xf0
    ? __seccomp_filter+0x28/0x230
    ? _raw_spin_unlock+0xa/0x20
    ? __handle_mm_fault+0x7d6/0x9b0
    ? list_lru_add+0xa8/0xc0
    ? _raw_spin_unlock+0xa/0x20
    ? __alloc_fd+0xaf/0x160
    ? do_sys_open+0x1a6/0x230
    do_sys_open+0x1a6/0x230
    do_syscall_64+0x5a/0x100
    entry_SYSCALL_64_after_hwframe+0x3d/0xa2

    Reviewed-by: Lee Duncan
    Reviewed-by: Jan Kara
    Signed-off-by: Jens Axboe
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Jens Axboe
     
  • [ Upstream commit 1ec6995d1290bfb87cc3a51f0836c889e857cef9 ]

    In z3fold_create_pool(), the memory allocated by __alloc_percpu() is
    not released on the error path where pool->compact_wq, which holds the
    return value of create_singlethread_workqueue(), is NULL. This results
    in a memory leak (see the sketch at the end of this entry).

    [akpm@linux-foundation.org: fix oops on kzalloc() failure, check __alloc_percpu() retval]
    Link: http://lkml.kernel.org/r/1522803111-29209-1-git-send-email-wangxidong_97@163.com
    Signed-off-by: Xidong Wang
    Reviewed-by: Andrew Morton
    Cc: Vitaly Wool
    Cc: Mike Rapoport
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Xidong Wang
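
    The leak is the classic missing-unwind pattern: when the second
    allocation fails, the first one must be released before returning. A
    minimal userspace model, with malloc() standing in for
    __alloc_percpu() and create_singlethread_workqueue() (not the actual
    z3fold code):

    #include <stdlib.h>

    struct pool {
            void *unbuddied;        /* stands in for the __alloc_percpu() result */
            void *compact_wq;       /* stands in for the workqueue */
    };

    static struct pool *pool_create(void)
    {
            struct pool *pool = calloc(1, sizeof(*pool));

            if (!pool)
                    return NULL;

            pool->unbuddied = malloc(128);
            if (!pool->unbuddied)
                    goto out_pool;

            pool->compact_wq = malloc(128);
            if (!pool->compact_wq)
                    goto out_unbuddied;     /* the release the bug was missing */

            return pool;

    out_unbuddied:
            free(pool->unbuddied);
    out_pool:
            free(pool);
            return NULL;
    }

    int main(void)
    {
            struct pool *p = pool_create();

            if (p) {
                    free(p->compact_wq);
                    free(p->unbuddied);
                    free(p);
            }
            return 0;
    }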
     
  • [ Upstream commit a06ad633a37c64a0cd4c229fc605cee8725d376e ]

    Calling swapon() on a zero length swap file on SSD can lead to a
    divide-by-zero.

    Although creating such files isn't possible with mkswap and they would
    be considered invalid, it would be better for the swapon code to be
    more robust and handle this condition gracefully (return -EINVAL),
    especially since the fix is small and straightforward.

    To help with wear leveling on SSD, the swapon syscall calculates a
    random position in the swap file using modulo p->highest_bit, which is
    set to maxpages - 1 in read_swap_header.

    If the swap file is zero length, read_swap_header sets maxpages=1 and
    last_page=0, resulting in p->highest_bit=0 and we divide-by-zero when we
    modulo p->highest_bit in swapon syscall.

    This can be prevented by having read_swap_header() return zero if
    last_page is zero (see the sketch at the end of this entry).

    Link: http://lkml.kernel.org/r/5AC747C1020000A7001FA82C@prv-mh.provo.novell.com
    Signed-off-by: Thomas Abraham
    Reported-by:
    Reviewed-by: Andrew Morton
    Cc: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Tom Abraham
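
    A minimal model of the failure and the guard, under the values the
    commit describes (maxpages ends up 1 and highest_bit 0 for a
    zero-length swap file); this is a sketch, not the kernel code:

    #include <stdio.h>
    #include <stdlib.h>

    /* Model of read_swap_header(): the buggy version effectively returned
     * last_page + 1 (i.e. 1) even for last_page == 0; the fix rejects it. */
    static unsigned long read_header(unsigned long last_page)
    {
            if (!last_page) {
                    fprintf(stderr, "empty swap file\n");
                    return 0;
            }
            return last_page + 1;
    }

    int main(void)
    {
            unsigned long maxpages = read_header(0);        /* zero-length file */

            if (!maxpages)
                    return EXIT_FAILURE;

            /* Without the check above, maxpages would be 1, highest_bit 0,
             * and the modulo below would divide by zero (SIGFPE). */
            unsigned long highest_bit = maxpages - 1;
            unsigned long offset = rand() % highest_bit;

            printf("random start offset %lu\n", offset);
            return 0;
    }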
     
  • [ Upstream commit a0b0d1c345d0317efe594df268feb5ccc99f651e ]

    proc_sys_link_fill_cache() does not take currently unregistering
    sysctl tables into account, which might result in a page fault in
    sysctl_follow_link() - add a check to fix it.

    This bug has been present since v3.4.

    Link: http://lkml.kernel.org/r/20180228013506.4915-1-danilokrummrich@dk-develop.de
    Fixes: 0e47c99d7fe25 ("sysctl: Replace root_list with links between sysctl_table_sets")
    Signed-off-by: Danilo Krummrich
    Acked-by: Kees Cook
    Reviewed-by: Andrew Morton
    Cc: "Luis R . Rodriguez"
    Cc: "Eric W. Biederman"
    Cc: Alexey Dobriyan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Danilo Krummrich
     
  • [ Upstream commit 639d6aafe437a7464399d2a77d006049053df06f ]

    __ro_after_init data gets stuck in the .rodata section. That's normally
    fine because the kernel itself manages the R/W properties.

    But, if we run __change_page_attr() on an area which is
    __ro_after_init, the .rodata checks will trigger and force the area to
    be immediately read-only, even if it is early-ish in boot. This caused
    problems when trying to clear the _PAGE_GLOBAL bit for these areas in
    the PTI code: it cleared _PAGE_GLOBAL like I asked, but also took it
    upon itself to clear _PAGE_RW. The kernel then oopsed the next time it
    wrote to a __ro_after_init data structure.

    To fix this, add the kernel_set_to_readonly check, just like we have
    for kernel text, just a few lines below in this function.

    Signed-off-by: Dave Hansen
    Acked-by: Kees Cook
    Cc: Andrea Arcangeli
    Cc: Andy Lutomirski
    Cc: Arjan van de Ven
    Cc: Borislav Petkov
    Cc: Dan Williams
    Cc: David Woodhouse
    Cc: Greg Kroah-Hartman
    Cc: Hugh Dickins
    Cc: Josh Poimboeuf
    Cc: Juergen Gross
    Cc: Linus Torvalds
    Cc: Nadav Amit
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: linux-mm@kvack.org
    Link: http://lkml.kernel.org/r/20180406205514.8D898241@viggo.jf.intel.com
    Signed-off-by: Ingo Molnar
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Dave Hansen
     
  • [ Upstream commit e3e288121408c3abeed5af60b87b95c847143845 ]

    The pmd_set_huge() and pud_set_huge() functions are used from
    the generic ioremap() code to establish large mappings where this
    is possible.

    But the generic ioremap() code does not check whether the
    PMD/PUD entries are already populated with a non-leaf entry,
    so that any page-table pages these entries point to will be
    lost.

    Further, on x86-32 with SHARED_KERNEL_PMD=0, this causes a
    BUG_ON() in vmalloc_sync_one() when PMD entries are synced
    from swapper_pg_dir to the current page-table. This happens
    because the PMD entry from swapper_pg_dir was promoted to a
    huge-page entry while the current PGD still contains the
    non-leaf entry. Because both entries are present and point
    to a different page, the BUG_ON() triggers.

    This was actually triggered with pti-x32 enabled in a KVM
    virtual machine by the graphics driver.

    A real and better fix for that would be to improve the
    page-table handling in the generic ioremap() code. But that is
    out-of-scope for this patch-set and left for later work.

    Reported-by: David H. Gutteridge
    Signed-off-by: Joerg Roedel
    Reviewed-by: Thomas Gleixner
    Cc: Andrea Arcangeli
    Cc: Andy Lutomirski
    Cc: Boris Ostrovsky
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Dave Hansen
    Cc: David Laight
    Cc: Denys Vlasenko
    Cc: Eduardo Valentin
    Cc: Greg KH
    Cc: Jiri Kosina
    Cc: Josh Poimboeuf
    Cc: Juergen Gross
    Cc: Linus Torvalds
    Cc: Pavel Machek
    Cc: Peter Zijlstra
    Cc: Waiman Long
    Cc: Will Deacon
    Cc: aliguori@amazon.com
    Cc: daniel.gruss@iaik.tugraz.at
    Cc: hughd@google.com
    Cc: keescook@google.com
    Cc: linux-mm@kvack.org
    Link: http://lkml.kernel.org/r/20180411152437.GC15462@8bytes.org
    Signed-off-by: Ingo Molnar
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Joerg Roedel
     
  • [ Upstream commit 471d557afed155b85da237ec46c549f443eeb5de ]

    Currently if we allocate extents beyond an inode's i_size (through the
    fallocate system call) and then fsync the file, we log the extents but
    after a power failure we replay them and then immediately drop them.
    This behaviour has existed since about 2009, commit c71bf099abdd ("Btrfs:
    Avoid orphan inodes cleanup while replaying log"), because it marks
    the inode as an orphan instead of dropping any extents beyond i_size
    before replaying logged extents, so after the log replay, and while
    the mount operation is still ongoing, we find the inode marked as an
    orphan and then perform a truncation (drop extents beyond the inode's
    i_size). Because the processing of orphan inodes is still done
    right after replaying the log and before the mount operation finishes,
    the intention of that commit does not make any sense (at least as
    of today). However reverting that behaviour is not enough, because
    we can not simply discard all extents beyond i_size and then replay
    logged extents, because we risk dropping extents beyond i_size created
    in past transactions, for example:

    add prealloc extent beyond i_size
    fsync - clears the flag BTRFS_INODE_NEEDS_FULL_SYNC from the inode
    transaction commit
    add another prealloc extent beyond i_size
    fsync - triggers the fast fsync path
    power failure

    In that scenario, we would drop the first extent and then replay the
    second one. To fix this just make sure that all prealloc extents
    beyond i_size are logged, and if we find too many (which is far from
    a common case), fallback to a full transaction commit (like we do when
    logging regular extents in the fast fsync path).

    Trivial reproducer:

    $ mkfs.btrfs -f /dev/sdb
    $ mount /dev/sdb /mnt
    $ xfs_io -f -c "pwrite -S 0xab 0 256K" /mnt/foo
    $ sync
    $ xfs_io -c "falloc -k 256K 1M" /mnt/foo
    $ xfs_io -c "fsync" /mnt/foo

    # mount to replay log
    $ mount /dev/sdb /mnt
    # at this point the file only has one extent, at offset 0, size 256K

    A test case for fstests follows soon, covering multiple scenarios that
    involve adding prealloc extents with previous shrinking truncates and
    without such truncates.

    Fixes: c71bf099abdd ("Btrfs: Avoid orphan inodes cleanup while replaying log")
    Signed-off-by: Filipe Manana
    Signed-off-by: David Sterba
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Filipe Manana
     
  • [ Upstream commit af7227338135d2f1b1552bf9a6d43e02dcba10b9 ]

    Currently if some fatal errors occur, like all IO get -EIO, resources
    would be cleaned up when
    a) transaction is being committed or
    b) BTRFS_FS_STATE_ERROR is set

    However, in some rare cases, resources may be left alone after transaction
    gets aborted and umount may run into some ASSERT(), e.g.
    ASSERT(list_empty(&block_group->dirty_list));

    For case a), in btrfs_commit_transaction(), there are several places
    at the beginning where we just call btrfs_end_transaction() without
    cleaning up resources. For case b), it is possible that the trans
    handle doesn't have any dirty stuff; then only the trans handle is
    marked as aborted while BTRFS_FS_STATE_ERROR is not set, so resources
    remain in memory.

    This makes btrfs also check BTRFS_FS_STATE_TRANS_ABORTED to make sure that
    all resources won't stay in memory after umount.

    Signed-off-by: Liu Bo
    Signed-off-by: David Sterba
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Liu Bo
     
  • [ Upstream commit 74c6c71530847808d4e3be7b205719270efee80c ]

    NVMe over Fabrics 1.0 Section 5.2 "Discovery Controller Properties and
    Command Support" Figure 31 "Discovery Controller – Admin Commands"
    explicitly lists all commands but "Get Log Page" and "Identify" as
    reserved, but NetApp reported that the Linux host is sending Keep
    Alive commands to the discovery controller, which is a violation of
    the spec.

    We're already checking for discovery controllers when configuring the
    keep alive timeout but when creating a discovery controller we're not
    hard wiring the keep alive timeout to 0 and thus remain on
    NVME_DEFAULT_KATO for the discovery controller.

    This can be easily reproduced when issuing a direct connect to the
    discovery subsystem using:
    'nvme connect [...] --nqn=nqn.2014-08.org.nvmexpress.discovery'

    Signed-off-by: Johannes Thumshirn
    Fixes: 07bfcd09a288 ("nvme-fabrics: add a generic NVMe over Fabrics library")
    Reported-by: Martin George
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Keith Busch
    Signed-off-by: Jens Axboe
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Johannes Thumshirn
     
  • [ Upstream commit 90fe6f8ff00a07641ca893d64f75ca22ce77cca2 ]

    The test which ensures that the DMI type 1 structure is long enough
    to hold the UUID is off by one: it would fail if the structure is
    exactly 24 bytes long, although that is sufficient to hold the UUID
    (see the illustration at the end of this entry).

    I don't expect this bug to cause problems in practice because all
    implementations I have seen had length 8, 25 or 27 bytes, in line
    with the SMBIOS specifications. But let's fix it still.

    Signed-off-by: Jean Delvare
    Fixes: a814c3597a6b ("firmware: dmi_scan: Check DMI structure length")
    Reviewed-by: Mika Westerberg
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Jean Delvare
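
    The arithmetic behind the off-by-one: the SMBIOS type 1 UUID field
    starts at offset 8 and is 16 bytes long, so a 24-byte structure
    already contains it in full. A small standalone illustration of the
    strict versus inclusive bound (illustrative constants, not the
    dmi_scan code itself):

    #include <stdbool.h>
    #include <stdio.h>

    #define UUID_OFFSET 8
    #define UUID_LEN    16

    static bool uuid_fits_buggy(unsigned int len)
    {
            return len > UUID_OFFSET + UUID_LEN;    /* rejects len == 24 */
    }

    static bool uuid_fits_fixed(unsigned int len)
    {
            return len >= UUID_OFFSET + UUID_LEN;   /* accepts len == 24 */
    }

    int main(void)
    {
            unsigned int len = 24;  /* exactly large enough to hold the UUID */

            printf("len=%u buggy=%d fixed=%d\n", len,
                   uuid_fits_buggy(len), uuid_fits_fixed(len));
            return 0;
    }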
     
  • [ Upstream commit 96a598996f6ac518ac79839ecbb17c91af91f4f7 ]

    When responding to a debug trap (breakpoint) in userspace, the
    kernel's trap handler raised SIGTRAP but returned from the trap via a
    code path that ignored pending signals, resulting in an infinite loop
    re-executing the trapping instruction.

    Signed-off-by: Rich Felker
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Rich Felker
     
  • [ Upstream commit e81b5e01c14add8395dfba7130f8829206bb507d ]

    In mvneta_port_up() we enable the relevant RX and TX port queues by
    writing the queue bit map to the appropriate register.

    q_map must be zero at the beginning of this process.

    Signed-off-by: Yelena Krivosheev
    Acked-by: Thomas Petazzoni
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Yelena Krivosheev
     
  • [ Upstream commit c769accdf3d8a103940bea2979b65556718567e9 ]

    In some situations vlan packets do not have ethernet headers. One
    example is packets from tun devices: users can specify a vlan protocol
    in the tun_pi field instead of an IP protocol. When we have a vlan
    device with reorder_hdr disabled on top of the tun device, such
    packets from tun devices are untagged in skb_vlan_untag() and vlan
    headers are inserted back in vlan_insert_inner_tag().

    vlan_insert_inner_tag() however did not expect packets without
    ethernet headers, so in such a case the size argument for memmove()
    underflowed.

    We don't need to copy headers for packets which have no headers
    preceding the vlan header, so skip memmove() in that case.
    Also don't write the vlan protocol to skb->data when there is not
    enough room for it.

    Fixes: cbe7128c4b92 ("vlan: Fix out of order vlan headers with reorder header off")
    Signed-off-by: Toshiaki Makita
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Toshiaki Makita
     
  • [ Upstream commit ae4745730cf8e693d354ccd4dbaf59ea440c09a9 ]

    In some situations vlan packets do not have ethernet headers. One
    example is packets from tun devices: users can specify a vlan protocol
    in the tun_pi field instead of an IP protocol, and skb_vlan_untag()
    attempts to untag such packets.

    skb_vlan_untag() (more precisely, skb_reorder_vlan_header() called by
    it) however did not expect packets without ethernet headers, so in
    such a case the size argument for memmove() underflowed and triggered
    a crash.

    ====
    BUG: unable to handle kernel paging request at ffff8801cccb8000
    IP: __memmove+0x24/0x1a0 arch/x86/lib/memmove_64.S:43
    PGD 9cee067 P4D 9cee067 PUD 1d9401063 PMD 1cccb7063 PTE 2810100028101
    Oops: 000b [#1] SMP KASAN
    Dumping ftrace buffer:
    (ftrace buffer empty)
    Modules linked in:
    CPU: 1 PID: 17663 Comm: syz-executor2 Not tainted 4.16.0-rc7+ #368
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    RIP: 0010:__memmove+0x24/0x1a0 arch/x86/lib/memmove_64.S:43
    RSP: 0018:ffff8801cc046e28 EFLAGS: 00010287
    RAX: ffff8801ccc244c4 RBX: fffffffffffffffe RCX: fffffffffff6c4c2
    RDX: fffffffffffffffe RSI: ffff8801cccb7ffc RDI: ffff8801cccb8000
    RBP: ffff8801cc046e48 R08: ffff8801ccc244be R09: ffffed0039984899
    R10: 0000000000000001 R11: ffffed0039984898 R12: ffff8801ccc244c4
    R13: ffff8801ccc244c0 R14: ffff8801d96b7c06 R15: ffff8801d96b7b40
    FS: 00007febd562d700(0000) GS:ffff8801db300000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: ffff8801cccb8000 CR3: 00000001ccb2f006 CR4: 00000000001606e0
    DR0: 0000000020000000 DR1: 0000000020000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
    Call Trace:
    memmove include/linux/string.h:360 [inline]
    skb_reorder_vlan_header net/core/skbuff.c:5031 [inline]
    skb_vlan_untag+0x470/0xc40 net/core/skbuff.c:5061
    __netif_receive_skb_core+0x119c/0x3460 net/core/dev.c:4460
    __netif_receive_skb+0x2c/0x1b0 net/core/dev.c:4627
    netif_receive_skb_internal+0x10b/0x670 net/core/dev.c:4701
    netif_receive_skb+0xae/0x390 net/core/dev.c:4725
    tun_rx_batched.isra.50+0x5ee/0x870 drivers/net/tun.c:1555
    tun_get_user+0x299e/0x3c20 drivers/net/tun.c:1962
    tun_chr_write_iter+0xb9/0x160 drivers/net/tun.c:1990
    call_write_iter include/linux/fs.h:1782 [inline]
    new_sync_write fs/read_write.c:469 [inline]
    __vfs_write+0x684/0x970 fs/read_write.c:482
    vfs_write+0x189/0x510 fs/read_write.c:544
    SYSC_write fs/read_write.c:589 [inline]
    SyS_write+0xef/0x220 fs/read_write.c:581
    do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287
    entry_SYSCALL_64_after_hwframe+0x42/0xb7
    RIP: 0033:0x454879
    RSP: 002b:00007febd562cc68 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
    RAX: ffffffffffffffda RBX: 00007febd562d6d4 RCX: 0000000000454879
    RDX: 0000000000000157 RSI: 0000000020000180 RDI: 0000000000000014
    RBP: 000000000072bea0 R08: 0000000000000000 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
    R13: 00000000000006b0 R14: 00000000006fc120 R15: 0000000000000000
    Code: 90 90 90 90 90 90 90 48 89 f8 48 83 fa 20 0f 82 03 01 00 00 48 39 fe 7d 0f 49 89 f0 49 01 d0 49 39 f8 0f 8f 9f 00 00 00 48 89 d1 a4 c3 48 81 fa a8 02 00 00 72 05 40 38 fe 74 3b 48 83 ea 20
    RIP: __memmove+0x24/0x1a0 arch/x86/lib/memmove_64.S:43 RSP: ffff8801cc046e28
    CR2: ffff8801cccb8000
    ====

    We don't need to copy headers for packets which have no headers
    preceding the vlan header, so skip memmove() in that case (see the
    underflow illustration at the end of this entry).

    Fixes: 4bbb3e0e8239 ("net: Fix vlan untag for bridge and vlan_dev with reorder_hdr off")
    Reported-by: Eric Dumazet
    Signed-off-by: Toshiaki Makita
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Toshiaki Makita
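
    The crash mechanism in this and the previous vlan entry is ordinary
    unsigned underflow: memmove() takes a size_t, so computing "header
    length minus vlan header length" for a packet whose preceding headers
    are shorter than a vlan header wraps around to an enormous value
    instead of going negative. A minimal demonstration (the constants and
    the check are illustrative, not the exact skb fields or the exact
    condition used in the kernel patches):

    #include <stdio.h>
    #include <stddef.h>

    #define VLAN_HLEN 4

    int main(void)
    {
            size_t mac_len = 0;     /* tun packet: no ethernet header at all */

            /* What a naive "copy the headers preceding the vlan tag" would
             * pass to memmove(): for mac_len < VLAN_HLEN this wraps to a
             * value close to SIZE_MAX. */
            size_t n = mac_len - VLAN_HLEN;

            printf("memmove size would be %zu bytes\n", n);

            /* The shape of the fix: only move headers when there actually
             * are headers preceding the vlan tag. */
            if (mac_len >= VLAN_HLEN) {
                    /* memmove(dst, src, mac_len - VLAN_HLEN); */
            }
            return 0;
    }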
     
  • [ Upstream commit 58f101bf87e32753342a6924772c6ebb0fbde24a ]

    Today, driver drops received packets which are indicated as
    invalid checksum by the device. Instead of dropping such packets,
    pass them to the stack with CHECKSUM_NONE indication in skb.

    Signed-off-by: Ariel Elior
    Signed-off-by: Manish Chopra
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Manish Chopra
     
  • [ Upstream commit f03dbb06dc380274e351ca4b1ee1587ed4529e62 ]

    My recent change to the netvsc driver in how receive flags are handled
    broke multicast. The Hyper-V/Azure virtual interface does not have a
    multicast filter list; filtering is only all or none. The driver must
    enable all-multicast if any multicast address is present.

    Fixes: 009f766ca238 ("hv_netvsc: filter multicast/broadcast")
    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Stephen Hemminger
     
  • [ Upstream commit 914b6dfff790544d9b77dfd1723adb3745ec9700 ]

    A crash is observed when kmemleak_scan accesses the object->pointer,
    likely due to the following race.

    TASK A                  TASK B                     TASK C
    kmemleak_write
     (with "scan" and
     NOT "scan=on")
    kmemleak_scan()
                            create_object
                            kmem_cache_alloc fails
                            kmemleak_disable
                            kmemleak_do_cleanup
                            kmemleak_free_enabled = 0
                                                       kfree
                                                       kmemleak_free bails out
                                                        (kmemleak_free_enabled is 0)
                                                       slub frees object->pointer
    update_checksum
    crash - object->pointer
     freed (DEBUG_PAGEALLOC)

    kmemleak_do_cleanup waits for the scan thread to complete, but not for
    a direct call to kmemleak_scan via kmemleak_write. So add a wait for
    kmemleak_scan completion before disabling kmemleak_free, and while at
    it fix the comment on stop_scan_thread.

    [vinmenon@codeaurora.org: fix stop_scan_thread comment]
    Link: http://lkml.kernel.org/r/1522219972-22809-1-git-send-email-vinmenon@codeaurora.org
    Link: http://lkml.kernel.org/r/1522063429-18992-1-git-send-email-vinmenon@codeaurora.org
    Signed-off-by: Vinayak Menon
    Reviewed-by: Catalin Marinas
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Vinayak Menon
     
  • [ Upstream commit c7f26ccfb2c31eb1bf810ba13d044fcf583232db ]

    Attempting to hotplug CPUs with CONFIG_VM_EVENT_COUNTERS enabled can
    cause vmstat_update() to report a BUG due to preemption not being
    disabled around smp_processor_id().

    Discovered on Ubiquiti EdgeRouter Pro with Cavium Octeon II processor.

    BUG: using smp_processor_id() in preemptible [00000000] code:
    kworker/1:1/269
    caller is vmstat_update+0x50/0xa0
    CPU: 0 PID: 269 Comm: kworker/1:1 Not tainted
    4.16.0-rc4-Cavium-Octeon-00009-gf83bbd5-dirty #1
    Workqueue: mm_percpu_wq vmstat_update
    Call Trace:
    show_stack+0x94/0x128
    dump_stack+0xa4/0xe0
    check_preemption_disabled+0x118/0x120
    vmstat_update+0x50/0xa0
    process_one_work+0x144/0x348
    worker_thread+0x150/0x4b8
    kthread+0x110/0x140
    ret_from_kernel_thread+0x14/0x1c

    Link: http://lkml.kernel.org/r/1520881552-25659-1-git-send-email-steven.hill@cavium.com
    Signed-off-by: Steven J. Hill
    Reviewed-by: Andrew Morton
    Cc: Tejun Heo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Steven J. Hill
     
  • [ Upstream commit 299815a4fba9f3c7a81434dba0072148f1690608 ]

    This patch fixes commit 5f48f0bd4e36 ("mm, page_owner: skip unnecessary
    stack_trace entries").

    If we skip the first two entries, the logic that checks for a count
    value of 2 to detect recursion is broken, and the code goes into
    one-depth recursion.

    So we need to check for only one call of _RET_IP_ (__set_page_owner)
    while checking for recursion.

    Current Backtrace while checking for recursion:-

    (save_stack) from (__set_page_owner) // (But recursion returns true here)
    (__set_page_owner) from (get_page_from_freelist)
    (get_page_from_freelist) from (__alloc_pages_nodemask)
    (__alloc_pages_nodemask) from (depot_save_stack)
    (depot_save_stack) from (save_stack) // recursion should return true here
    (save_stack) from (__set_page_owner)
    (__set_page_owner) from (get_page_from_freelist)
    (get_page_from_freelist) from (__alloc_pages_nodemask+)
    (__alloc_pages_nodemask) from (depot_save_stack)
    (depot_save_stack) from (save_stack)
    (save_stack) from (__set_page_owner)
    (__set_page_owner) from (get_page_from_freelist)

    Correct Backtrace with fix:

    (save_stack) from (__set_page_owner) // recursion returned true here
    (__set_page_owner) from (get_page_from_freelist)
    (get_page_from_freelist) from (__alloc_pages_nodemask+)
    (__alloc_pages_nodemask) from (depot_save_stack)
    (depot_save_stack) from (save_stack)
    (save_stack) from (__set_page_owner)
    (__set_page_owner) from (get_page_from_freelist)

    Link: http://lkml.kernel.org/r/1521607043-34670-1-git-send-email-maninder1.s@samsung.com
    Fixes: 5f48f0bd4e36 ("mm, page_owner: skip unnecessary stack_trace entries")
    Signed-off-by: Maninder Singh
    Signed-off-by: Vaneet Narang
    Acked-by: Vlastimil Babka
    Cc: Michal Hocko
    Cc: Oscar Salvador
    Cc: Greg Kroah-Hartman
    Cc: Ayush Mittal
    Cc: Prakash Gupta
    Cc: Vinayak Menon
    Cc: Vasyl Gomonovych
    Cc: Amit Sahrawat
    Cc:
    Cc: Vaneet Narang
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Maninder Singh
     
  • [ Upstream commit 880cd276dff17ea29e9a8404275c9502b265afa7 ]

    All the root caches are linked into slab_root_caches, which was
    introduced by commit 510ded33e075 ("slab: implement slab_root_caches
    list"), but it missed adding SLAB's kmem_cache to it.

    While experimenting with opt-in/opt-out kmem accounting, I noticed
    system crashes due to a NULL dereference inside cache_from_memcg_idx()
    while dereferencing kmem_cache.memcg_params.memcg_caches. A clean
    upstream kernel will not see these crashes, but SLAB should be
    consistent with SLUB, which does link its boot caches (kmem_cache_node
    and kmem_cache) into slab_root_caches.

    Link: http://lkml.kernel.org/r/20180319210020.60289-1-shakeelb@google.com
    Fixes: 510ded33e075c ("slab: implement slab_root_caches list")
    Signed-off-by: Shakeel Butt
    Cc: Tejun Heo
    Cc: Vladimir Davydov
    Cc: Greg Thelen
    Cc: Christoph Lameter
    Cc: Pekka Enberg
    Cc: David Rientjes
    Cc: Joonsoo Kim
    Cc: Johannes Weiner
    Cc: Michal Hocko
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Shakeel Butt
     
  • [ Upstream commit b9fc828debc8ac2bb21b5819a44d2aea456f1c95 ]

    Since commit c5ad119fb6c09b0297446be05bd66602fa564758
    ("net: sched: pfifo_fast use skb_array") the driver is exposed
    to an issue where it hits NULL skbs while handling TX
    completions. The driver uses mmiowb() to flush the writes to the
    doorbell bar, which is a write-combined bar; however, on x86
    mmiowb() does not flush the write-combined buffer.

    This patch fixes the problem by replacing mmiowb() with wmb()
    after the write-combined doorbell write, so that the writes are
    flushed and synchronized from more than one processor.

    V1->V2:
    -------
    This patch was marked as "superseded" in patchwork
    (not really sure for what reason). Resending it as v2.

    Signed-off-by: Ariel Elior
    Signed-off-by: Manish Chopra

    Signed-off-by: David S. Miller

    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Manish Chopra
     
  • [ Upstream commit f8437520704cfd9cc442a99d73ed708a3cdadaf9 ]

    Since d5d332d3f7e8, a couple of links in scripts/dtc/include-prefixes
    are additionally required in order to build device trees with the header
    package.

    Signed-off-by: Jan Kiszka
    Reviewed-by: Riku Voipio
    Signed-off-by: Masahiro Yamada
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Jan Kiszka
     
  • [ Upstream commit b85ab56c3f81c5a24b5a5213374f549df06430da ]

    llc_conn_send_pdu() pushes the skb into the write queue and
    calls llc_conn_send_pdus() to flush them out. However, the
    status of dev_queue_xmit() is not returned to the caller,
    in this case llc_conn_state_process().

    llc_conn_state_process() needs to hold the skb regardless of
    success or failure, because it still uses it afterwards;
    therefore we should hold the skb before dev_queue_xmit() when
    that skb is the one being processed by llc_conn_state_process().

    Other callers can just pass NULL and ignore the return value
    as they do now.

    Reported-by: Noam Rathaus
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Cong Wang
     
  • [ Upstream commit bd6271039ee6f0c9b468148fc2d73e0584af6b4f ]

    The following pattern fails to compile while the same pattern
    with alternative_call() does:

        if (...)
                alternative_call_2(...);
        else
                alternative_call_2(...);

    as it expands into

        if (...)
        {
        };

    (A standalone illustration of this macro pitfall follows at the end
    of this entry.)

    Signed-off-by: Thomas Gleixner
    Acked-by: Borislav Petkov
    Link: https://lkml.kernel.org/r/20180114120504.GA11368@avx2
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Alexey Dobriyan
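
    The underlying C pitfall is generic: a macro that expands to a braced
    compound statement, followed by the caller's ';', produces an empty
    statement that terminates the if, so a following else has nothing to
    attach to. A standalone illustration with hypothetical macros (the
    conventional do { ... } while (0) wrapper is shown as one way out; the
    actual kernel fix may differ):

    #include <stdio.h>

    /* Expands to a bare block: "BROKEN(x);" becomes "{ ... } ;" and the
     * stray ';' ends the if statement, orphaning any following else. */
    #define BROKEN(x)  { printf("broken %d\n", (x)); }

    /* Conventional fix: a do/while(0) swallows the caller's ';'. */
    #define FIXED(x)   do { printf("fixed %d\n", (x)); } while (0)

    int main(void)
    {
            int cond = 1;

    #if 0   /* does not compile: "else without a previous if" */
            if (cond)
                    BROKEN(1);
            else
                    BROKEN(2);
    #endif

            if (cond)
                    FIXED(1);
            else
                    FIXED(2);

            return 0;
    }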
     
  • [ Upstream commit 71eb9ee9596d8df3d5723c3cfc18774c6235e8b1 ]

    This patch fixes a bug in how pebs->real_ip is handled in the PEBS
    handler. real_ip only exists on Haswell and later processors. It is
    the eventing IP, i.e., where the event occurred, as opposed to
    pebs->ip, which is the PEBS interrupt IP and is always off by one.

    The problem is that real_ip, just like the IP, needs to be fixed up,
    because PEBS does not record all the machine state registers, in
    particular the code segment (cs). This is why we have the
    set_linear_ip() function. The problem was that set_linear_ip() was
    only used on pebs->ip and not on pebs->real_ip.

    We have profiles which ran into invalid callstacks because of this.
    Here is an example:

    ..... 0: ffffffffffffff80 recent entry, marker kernel v
    ..... 1: 000000000040044d cs=10 user_mode(regs)=0

    The problem is that the kernel entry in 1: points to a user level
    address. How can that be?

    The reason is that with PEBS sampling, the instruction that caused the
    event to occur and the instruction where the CPU was when the
    interrupt was posted may be far apart, and sometime during that
    window the privilege level may change. This happens, for instance,
    when the PEBS sample is taken close to a kernel entry point. Here the
    PEBS eventing IP (real_ip) captured a user-level instruction, but by
    the time the PMU interrupt fired, the processor had already entered
    kernel space. This is why the debug output shows a user address with
    user_mode() false.

    The problem comes from PEBS not recording the code segment (cs) register.
    The register is used in x86_64 to determine if executing in kernel vs user
    space. This is okay because the kernel has a software workaround called
    set_linear_ip(). But the issue in setup_pebs_sample_data() is that
    set_linear_ip() is never called on the real_ip value when it is available
    (Haswell and later) and precise_ip > 1.

    This patch fixes this problem and eliminates the callchain discrepancy.

    The patch restructures the code around set_linear_ip() to minimize the number
    of times the IP has to be set.

    Signed-off-by: Stephane Eranian
    Cc: Alexander Shishkin
    Cc: Arnaldo Carvalho de Melo
    Cc: Jiri Olsa
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Vince Weaver
    Cc: kan.liang@intel.com
    Link: http://lkml.kernel.org/r/1521788507-10231-1-git-send-email-eranian@google.com
    Signed-off-by: Ingo Molnar
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Stephane Eranian
     
  • [ Upstream commit f125376b06bcc57dfb0216ac8d6ec6d5dcf81025 ]

    Add a dependency on switchdev being configured, as any user-space
    control plane SW is expected to use the HW switchdev ID to locate the
    representors related to VFs of a certain PF and apply SW/offloaded
    switching on them.

    Fixes: e80541ecabd5 ('net/mlx5: Add CONFIG_MLX5_ESWITCH Kconfig')
    Signed-off-by: Or Gerlitz
    Reviewed-by: Mark Bloch
    Signed-off-by: Saeed Mahameed
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Or Gerlitz
     
  • [ Upstream commit 3c82b372a9f44aa224b8d5106ff6f1ad516fa8a8 ]

    It's required to create a modules.alias entry via the
    MODULE_DEVICE_TABLE helper for the OF platform driver; otherwise,
    module autoloading cannot work.

    Signed-off-by: Sean Wang
    Reviewed-by: Andrew Lunn
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Sean Wang
     
  • [ Upstream commit 5c78f6bfae2b10ff70e21d343e64584ea6280c26 ]

    vlan_vids_add_by_dev() is called right after the dev hwaddr sync, so
    on the error path it should unsync the dev hwaddr. Otherwise, the
    slave dev's hwaddr will never be unsynced when this error happens.

    Fixes: 1ff412ad7714 ("bonding: change the bond's vlan syncing functions with the standard ones")
    Signed-off-by: Xin Long
    Reviewed-by: Nikolay Aleksandrov
    Acked-by: Andy Gospodarek
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Xin Long
     
  • [ Upstream commit 743989254ea9f132517806d8893ca9b6cf9dc86b ]

    The BroadMobi BM806U is a Qualcomm MDM9225 based 3G/4G modem.
    The tested hardware, a BM806U, is mounted on a D-Link DWR-921-C3
    router. The USB ID is added to qmi_wwan.c to allow QMI communication
    with the BM806U.

    Tested on a 4.14 kernel and OpenWRT.

    Signed-off-by: Pawel Dembicki
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Pawel Dembicki
     
  • [ Upstream commit e69647a19c870c2f919e4d5023af8a515e8ef25f ]

    Description:
    EEE does not work with the lan7800 when AutoSpeed is not set.
    (This can happen when the EEPROM is not populated or is configured
    incorrectly.)

    Root-Cause:
    When EEE is enabled, the ASD bit in the MAC config register is not
    set, i.e. it is left in its default state, causing EEE to fail.

    Fix:
    Set the register when the EEPROM is not present.

    Fixes: 55d7de9de6c3 ("Microchip's LAN7800 family USB 2/3 to 10/100/1000 Ethernet device driver")
    Signed-off-by: Raghuram Chary J
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Raghuram Chary J