30 May, 2018

40 commits

  • [ Upstream commit f0849ac0b8e072073ec5fcc7fadd05a77434364e ]

    For a PTE-mapped THP, the compound THP has not been split into normal
    4K pages yet, so the whole THP is considered referenced if any one of
    its sub pages is referenced.

    When walking a PTE-mapped THP by pvmw (page_vma_mapped_walk), all
    relevant PTEs are checked to retrieve the referenced bit. But the
    current code just returns the result for the last PTE: if the last PTE
    has not been referenced, the referenced flag gets cleared.

    Just set referenced when ptep{pmdp}_clear_young_notify() returns true
    (see the sketch at the end of this entry).

    Link: http://lkml.kernel.org/r/1518212451-87134-1-git-send-email-yang.shi@linux.alibaba.com
    Signed-off-by: Yang Shi
    Reported-by: Gang Deng
    Suggested-by: Kirill A. Shutemov
    Reviewed-by: Andrew Morton
    Cc: "Kirill A. Shutemov"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Yang Shi
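
    A minimal userspace model of the aggregation pattern, with a
    hypothetical check_young() standing in for
    ptep{pmdp}_clear_young_notify(); this is not the kernel code, just an
    illustration that assigning the per-PTE result overwrites earlier
    hits, while only setting the flag on true accumulates them:

    #include <stdbool.h>
    #include <stdio.h>

    /* Stand-in for ptep_clear_young_notify(): only sub page 1 was touched. */
    static bool check_young(int subpage)
    {
            return subpage == 1;
    }

    int main(void)
    {
            bool buggy = false, fixed = false;

            for (int i = 0; i < 4; i++) {
                    buggy = check_young(i);         /* overwrites earlier hits */
                    if (check_young(i))
                            fixed = true;           /* accumulates any hit */
            }

            /* Prints buggy=0 fixed=1: the last sub page was not young, so
             * the buggy variant loses the reference seen on sub page 1. */
            printf("buggy=%d fixed=%d\n", buggy, fixed);
            return 0;
    }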
     
  • [ Upstream commit e92bb4dd9673945179b1fc738c9817dd91bfb629 ]

    When page_mapping() is called and the mapping is dereferenced in
    page_evictable() through shrink_active_list(), it is possible for the
    inode to be truncated and the embedded address_space to be freed at
    the same time. This may lead to the following race.

    CPU1                                              CPU2

    truncate(inode)                                   shrink_active_list()
      ...                                               page_evictable(page)
      truncate_inode_page(mapping, page);
        delete_from_page_cache(page)
          spin_lock_irqsave(&mapping->tree_lock, flags);
            __delete_from_page_cache(page, NULL)
              page_cache_tree_delete(..)
                ...                                       mapping = page_mapping(page);
                page->mapping = NULL;
              ...
          spin_unlock_irqrestore(&mapping->tree_lock, flags);
          page_cache_free_page(mapping, page)
            put_page(page)
              if (put_page_testzero(page)) -> false
    - inode now has no pages and can be freed including embedded
      address_space

                                                      mapping_unevictable(mapping)
                                                        test_bit(AS_UNEVICTABLE, &mapping->flags);
    - we've dereferenced mapping which is potentially already free.

    A similar race exists between swap cache freeing and page_evictable()
    too.

    The address_space in the inode and swap cache will be freed after an
    RCU grace period, so the races are fixed by enclosing the
    page_mapping() call and the address_space usage in
    rcu_read_lock()/rcu_read_unlock(). Some comments are added in the code
    to make it clear what is protected by the RCU read lock.

    Link: http://lkml.kernel.org/r/20180212081227.1940-1-ying.huang@intel.com
    Signed-off-by: "Huang, Ying"
    Reviewed-by: Jan Kara
    Reviewed-by: Andrew Morton
    Cc: Mel Gorman
    Cc: Minchan Kim
    Cc: "Huang, Ying"
    Cc: Johannes Weiner
    Cc: Michal Hocko
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Huang Ying
     
  • [ Upstream commit 77da2ba0648a4fd52e5ff97b8b2b8dd312aec4b0 ]

    This patch fixes a corner case for KSM. When two pages belong or
    belonged to the same transparent hugepage, and they should be merged,
    KSM fails to split the page, and therefore no merging happens.

    This bug can be reproduced by:
    * making sure ksm is running (disabling ksmtuned if necessary)
    * enabling transparent hugepages
    * allocating a THP-aligned, 1-THP-sized buffer,
      e.g. on amd64: posix_memalign(&p, 1<<21, 1<<21)
    * filling it with identical values, marking it mergeable with
      madvise(), and waiting for KSM to perform a few scans (see the
      sketch at the end of this entry)
    Co-authored-by: Gerald Schaefer
    Reviewed-by: Andrew Morton
    Cc: Andrea Arcangeli
    Cc: Minchan Kim
    Cc: Kirill A. Shutemov
    Cc: Hugh Dickins
    Cc: Christian Borntraeger
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Claudio Imbrenda
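
    A minimal userspace sketch of the reproducer above, assuming a 2 MiB
    THP size on amd64 and that KSM and transparent hugepages are already
    enabled via sysfs; the 60-second wait is arbitrary and error handling
    is minimal:

    #define _GNU_SOURCE
    #include <stdlib.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <unistd.h>

    #define THP_SIZE (1UL << 21)    /* 2 MiB THP on amd64 */

    int main(void)
    {
            void *p;

            /* THP-aligned, 1-THP-sized buffer */
            if (posix_memalign(&p, THP_SIZE, THP_SIZE))
                    return 1;

            /* fill it with identical values so the sub pages can be merged */
            memset(p, 42, THP_SIZE);

            /* make it mergeable and give KSM time to perform a few scans */
            if (madvise(p, THP_SIZE, MADV_MERGEABLE))
                    return 1;
            sleep(60);

            return 0;
    }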
     
  • [ Upstream commit 41f714672f93608751dbd2fa2291d476a8ff0150 ]

    The counter that tracks used TX descriptors pending completion
    needs to be zeroed as part of a device reset. This change fixes
    a bug causing transmit queues to be stopped unnecessarily and in
    some cases a transmit queue stall and timeout reset. If the counter
    is not reset, the remaining descriptors will not be "removed",
    effectively reducing the queue capacity. If the queue is over half
    full, this will cause the queue to stall once it is stopped.

    Signed-off-by: Thomas Falcon
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Thomas Falcon
     
  • [ Upstream commit 76327a35caabd1a932e83d6a42b967aa08584e5d ]

    The datasheet specifies a 3 µs pause after performing a software
    reset. The default implementation of genphy_soft_reset() does not
    provide this, so implement soft_reset with the needed pause.

    Signed-off-by: Esben Haabendal
    Reviewed-by: Andrew Lunn
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Esben Haabendal
     
  • [ Upstream commit 7854e499f33fd9c7e63288692ffb754d9b1d02fd ]

    The clang API calls used by perf have changed in recent releases and
    builds succeed with libclang-3.9 only. This introduces compatibility
    with libclang-4.0 and above.

    Without this patch, we will see the following compilation errors with
    libclang-4.0+:

    util/c++/clang.cpp: In function ‘clang::CompilerInvocation* perf::createCompilerInvocation(llvm::opt::ArgStringList, llvm::StringRef&, clang::DiagnosticsEngine&)’:
    util/c++/clang.cpp:62:33: error: ‘IK_C’ was not declared in this scope
    Opts.Inputs.emplace_back(Path, IK_C);
    ^~~~
    util/c++/clang.cpp: In function ‘std::unique_ptr perf::getModuleFromSource(llvm::opt::ArgStringList, llvm::StringRef, llvm::IntrusiveRefCntPtr)’:
    util/c++/clang.cpp:75:26: error: no matching function for call to ‘clang::CompilerInstance::setInvocation(clang::CompilerInvocation*)’
    Clang.setInvocation(&*CI);
    ^
    In file included from util/c++/clang.cpp:14:0:
    /usr/include/clang/Frontend/CompilerInstance.h:231:8: note: candidate: void clang::CompilerInstance::setInvocation(std::shared_ptr)
    void setInvocation(std::shared_ptr Value);
    ^~~~~~~~~~~~~

    Committer testing:

    Tested on Fedora 27 after installing the clang-devel and llvm-devel
    packages, versions:

    # rpm -qa | egrep llvm\|clang
    llvm-5.0.1-6.fc27.x86_64
    clang-libs-5.0.1-5.fc27.x86_64
    clang-5.0.1-5.fc27.x86_64
    clang-tools-extra-5.0.1-5.fc27.x86_64
    llvm-libs-5.0.1-6.fc27.x86_64
    llvm-devel-5.0.1-6.fc27.x86_64
    clang-devel-5.0.1-5.fc27.x86_64
    #

    Make sure you don't have some older version lying around in /usr/local,
    etc, then:

    $ make LIBCLANGLLVM=1 -C tools/perf install-bin

    And in the end perf will be linked against these libraries:

    # ldd ~/bin/perf | egrep -i llvm\|clang
    libclangAST.so.5 => /lib64/libclangAST.so.5 (0x00007f8bb2eb4000)
    libclangBasic.so.5 => /lib64/libclangBasic.so.5 (0x00007f8bb29e3000)
    libclangCodeGen.so.5 => /lib64/libclangCodeGen.so.5 (0x00007f8bb23f7000)
    libclangDriver.so.5 => /lib64/libclangDriver.so.5 (0x00007f8bb2060000)
    libclangFrontend.so.5 => /lib64/libclangFrontend.so.5 (0x00007f8bb1d06000)
    libclangLex.so.5 => /lib64/libclangLex.so.5 (0x00007f8bb1a3e000)
    libclangTooling.so.5 => /lib64/libclangTooling.so.5 (0x00007f8bb17d4000)
    libclangEdit.so.5 => /lib64/libclangEdit.so.5 (0x00007f8bb15c5000)
    libclangSema.so.5 => /lib64/libclangSema.so.5 (0x00007f8bb0cc9000)
    libclangAnalysis.so.5 => /lib64/libclangAnalysis.so.5 (0x00007f8bb0a23000)
    libclangParse.so.5 => /lib64/libclangParse.so.5 (0x00007f8bb0725000)
    libclangSerialization.so.5 => /lib64/libclangSerialization.so.5 (0x00007f8bb039a000)
    libLLVM-5.0.so => /lib64/libLLVM-5.0.so (0x00007f8bace98000)
    libclangASTMatchers.so.5 => /lib64/../lib64/libclangASTMatchers.so.5 (0x00007f8bab735000)
    libclangFormat.so.5 => /lib64/../lib64/libclangFormat.so.5 (0x00007f8bab4b2000)
    libclangRewrite.so.5 => /lib64/../lib64/libclangRewrite.so.5 (0x00007f8bab2a1000)
    libclangToolingCore.so.5 => /lib64/../lib64/libclangToolingCore.so.5 (0x00007f8bab08e000)
    #

    Signed-off-by: Sandipan Das
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Jiri Olsa
    Cc: Naveen N. Rao
    Fixes: 00b86691c77c ("perf clang: Add builtin clang support ant test case")
    Link: http://lkml.kernel.org/r/20180404180419.19056-2-sandipan@linux.vnet.ibm.com
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Sandipan Das
     
  • [ Upstream commit c2fb54a183cfe77c6fdc9d71e2d5299c1c302a6e ]

    For libclang, some distro packages provide static libraries (.a) while
    some provide shared libraries (.so). Currently, the perf code can only
    be linked with static libraries. This change makes the perf build
    possible for both cases.

    Signed-off-by: Sandipan Das
    Cc: Jiri Olsa
    Cc: Naveen N. Rao
    Fixes: d58ac0bf8d1e ("perf build: Add clang and llvm compile and linking support")
    Link: http://lkml.kernel.org/r/20180404180419.19056-1-sandipan@linux.vnet.ibm.com
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Sandipan Das
     
  • [ Upstream commit 709b973c844c0b4d115ac3a227a2e5a68722c912 ]

    The function get_user() can sleep while trying to fetch an instruction
    from user address space, which triggers the following warning from the
    scheduler:

    BUG: sleeping function called from invalid context

    Interrupts do get enabled again, but only a bit later, after
    get_user() is called. This change moves the enabling of these
    interrupts earlier, so that it covers the get_user() call. While at
    it, also check for kernel mode and crash, as this interrupt should not
    have been triggered from kernel context.

    Signed-off-by: Anshuman Khandual
    Signed-off-by: Michael Ellerman
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Anshuman Khandual
     
  • [ Upstream commit 8913315e9459b146e5888ab5138e10daa061b885 ]

    When multiple CPUs are related in one cpufreq policy, the first online
    CPU will be chosen by default to handle cpufreq operations. Let's take
    cpu0 and cpu1 as an example.

    When cpu0 goes offline, policy->cpu is shifted to cpu1, so cpu1's perf
    capabilities should be initialized. Otherwise the perf capabilities
    are all zeros and speed changes cannot take effect.

    This patch copies the perf capabilities of the first online CPU to the
    other shared CPUs when the policy shared type is
    CPUFREQ_SHARED_TYPE_ANY (see the sketch at the end of this entry).

    Acked-by: Viresh Kumar
    Signed-off-by: Shunyong Yang
    Signed-off-by: Rafael J. Wysocki
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Shunyong Yang
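
    A minimal model of the capability copy described above, with a
    hypothetical struct perf_caps standing in for the driver's per-CPU
    capability data (the real patch operates on the cppc_cpufreq driver's
    internal structures):

    #include <stdio.h>

    #define NR_CPUS 4

    /* Hypothetical stand-in for per-CPU performance capabilities. */
    struct perf_caps {
            unsigned int highest_perf;
            unsigned int lowest_perf;
    };

    static struct perf_caps caps[NR_CPUS];

    /* Copy the capabilities of the policy's first online CPU to the other
     * CPUs sharing the policy, so they are non-zero if that CPU goes
     * offline and another CPU takes over policy->cpu. */
    static void copy_shared_caps(int first_cpu, const int *shared, int n)
    {
            for (int i = 0; i < n; i++)
                    if (shared[i] != first_cpu)
                            caps[shared[i]] = caps[first_cpu];
    }

    int main(void)
    {
            int shared[] = { 0, 1 };

            caps[0].highest_perf = 300;
            caps[0].lowest_perf = 100;

            copy_shared_caps(0, shared, 2);
            printf("cpu1 highest_perf=%u\n", caps[1].highest_perf);
            return 0;
    }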
     
  • [ Upstream commit 8c81dd46ef3c416b3b95e3020fb90dbd44e6140b ]

    Forcing the log to disk after reading the AGF is wrong: we might be
    calling xfs_log_force() with XFS_LOG_SYNC while holding a metadata
    lock.

    This can cause a deadlock when racing a fstrim with a filesystem
    shutdown.

    The deadlock was identified due to a miscalculation bug in the
    device-mapper dm-thin target, which reports lack of space to its users
    earlier than the device really runs out of space, putting the
    device-mapper volume into an error state.

    The problem happened while filling the filesystem with a single file,
    triggering the bug in device-mapper, consequently causing an IO error
    and shutting down the filesystem.

    If such a file is removed and fstrim is executed before XFS finishes
    the shutdown process, the fstrim process ends up holding the buffer
    lock and going to sleep on the CIL wait queue.

    At this point, the shutdown process tries to wake up all the threads
    waiting on the CIL wait queue, but to do so it tries to take the same
    buffer lock already held by fstrim, locking up the filesystem.

    Signed-off-by: Carlos Maiolino
    Reviewed-by: Darrick J. Wong
    Signed-off-by: Darrick J. Wong
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Carlos Maiolino
     
  • [ Upstream commit 2d097c50212e137e7b53ffe3b37561153eeba87d ]

    We can't just use scsi_cd() to get the scsi_cd structure, we have
    to grab a live reference to the device. For both callbacks, we're
    not inside an open where we already hold a reference to the device.

    This fixes device removal/addition under concurrent device access,
    which otherwise could result in the below oops.

    NULL pointer dereference at 0000000000000010
    PGD 0 P4D 0
    Oops: 0000 [#1] PREEMPT SMP
    Modules linked in:
    sr 12:0:0:0: [sr2] scsi-1 drive
    scsi_debug crc_t10dif crct10dif_generic crct10dif_common nvme nvme_core sb_edac xl
    sr 12:0:0:0: Attached scsi CD-ROM sr2
    sr_mod cdrom btrfs xor zstd_decompress zstd_compress xxhash lzo_compress zlib_defc
    sr 12:0:0:0: Attached scsi generic sg7 type 5
    igb ahci libahci i2c_algo_bit libata dca [last unloaded: crc_t10dif]
    CPU: 43 PID: 4629 Comm: systemd-udevd Not tainted 4.16.0+ #650
    Hardware name: Dell Inc. PowerEdge T630/0NT78X, BIOS 2.3.4 11/09/2016
    RIP: 0010:sr_block_revalidate_disk+0x23/0x190 [sr_mod]
    RSP: 0018:ffff883ff357bb58 EFLAGS: 00010292
    RAX: ffffffffa00b07d0 RBX: ffff883ff3058000 RCX: ffff883ff357bb66
    RDX: 0000000000000003 RSI: 0000000000007530 RDI: ffff881fea631000
    RBP: 0000000000000000 R08: ffff881fe4d38400 R09: 0000000000000000
    R10: 0000000000000000 R11: 00000000000001b6 R12: 000000000800005d
    R13: 000000000800005d R14: ffff883ffd9b3790 R15: 0000000000000000
    FS: 00007f7dc8e6d8c0(0000) GS:ffff883fff340000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000000010 CR3: 0000003ffda98005 CR4: 00000000003606e0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Call Trace:
    ? __invalidate_device+0x48/0x60
    check_disk_change+0x4c/0x60
    sr_block_open+0x16/0xd0 [sr_mod]
    __blkdev_get+0xb9/0x450
    ? iget5_locked+0x1c0/0x1e0
    blkdev_get+0x11e/0x320
    ? bdget+0x11d/0x150
    ? _raw_spin_unlock+0xa/0x20
    ? bd_acquire+0xc0/0xc0
    do_dentry_open+0x1b0/0x320
    ? inode_permission+0x24/0xc0
    path_openat+0x4e6/0x1420
    ? cpumask_any_but+0x1f/0x40
    ? flush_tlb_mm_range+0xa0/0x120
    do_filp_open+0x8c/0xf0
    ? __seccomp_filter+0x28/0x230
    ? _raw_spin_unlock+0xa/0x20
    ? __handle_mm_fault+0x7d6/0x9b0
    ? list_lru_add+0xa8/0xc0
    ? _raw_spin_unlock+0xa/0x20
    ? __alloc_fd+0xaf/0x160
    ? do_sys_open+0x1a6/0x230
    do_sys_open+0x1a6/0x230
    do_syscall_64+0x5a/0x100
    entry_SYSCALL_64_after_hwframe+0x3d/0xa2

    Reviewed-by: Lee Duncan
    Reviewed-by: Jan Kara
    Signed-off-by: Jens Axboe
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Jens Axboe
     
  • [ Upstream commit 1ec6995d1290bfb87cc3a51f0836c889e857cef9 ]

    In z3fold_create_pool(), the memory allocated by __alloc_percpu() is
    not released on the error path where pool->compact_wq, which holds the
    return value of create_singlethread_workqueue(), is NULL. This results
    in a memory leak (see the sketch at the end of this entry).

    [akpm@linux-foundation.org: fix oops on kzalloc() failure, check __alloc_percpu() retval]
    Link: http://lkml.kernel.org/r/1522803111-29209-1-git-send-email-wangxidong_97@163.com
    Signed-off-by: Xidong Wang
    Reviewed-by: Andrew Morton
    Cc: Vitaly Wool
    Cc: Mike Rapoport
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Xidong Wang
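
    The leak is the classic missing-unwind pattern: when the second
    allocation fails, the first one must be released before returning. A
    minimal userspace model, with malloc() standing in for
    __alloc_percpu() and create_singlethread_workqueue() (not the actual
    z3fold code):

    #include <stdlib.h>

    struct pool {
            void *unbuddied;        /* stands in for the __alloc_percpu() result */
            void *compact_wq;       /* stands in for the workqueue */
    };

    static struct pool *pool_create(void)
    {
            struct pool *pool = calloc(1, sizeof(*pool));

            if (!pool)
                    return NULL;

            pool->unbuddied = malloc(128);
            if (!pool->unbuddied)
                    goto out_pool;

            pool->compact_wq = malloc(128);
            if (!pool->compact_wq)
                    goto out_unbuddied;     /* the release the bug was missing */

            return pool;

    out_unbuddied:
            free(pool->unbuddied);
    out_pool:
            free(pool);
            return NULL;
    }

    int main(void)
    {
            struct pool *p = pool_create();

            if (p) {
                    free(p->compact_wq);
                    free(p->unbuddied);
                    free(p);
            }
            return 0;
    }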
     
  • [ Upstream commit a06ad633a37c64a0cd4c229fc605cee8725d376e ]

    Calling swapon() on a zero length swap file on SSD can lead to a
    divide-by-zero.

    Although creating such files isn't possible with mkswap and they would
    be considered invalid, it would be better for the swapon code to be
    more robust and handle this condition gracefully (return -EINVAL),
    especially since the fix is small and straightforward.

    To help with wear leveling on SSD, the swapon syscall calculates a
    random position in the swap file using modulo p->highest_bit, which is
    set to maxpages - 1 in read_swap_header.

    If the swap file is zero length, read_swap_header sets maxpages=1 and
    last_page=0, resulting in p->highest_bit=0 and we divide-by-zero when we
    modulo p->highest_bit in swapon syscall.

    This can be prevented by having read_swap_header() return zero if
    last_page is zero (see the sketch at the end of this entry).

    Link: http://lkml.kernel.org/r/5AC747C1020000A7001FA82C@prv-mh.provo.novell.com
    Signed-off-by: Thomas Abraham
    Reported-by:
    Reviewed-by: Andrew Morton
    Cc: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Tom Abraham
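
    A minimal model of the failure and the guard, under the values the
    commit describes (maxpages ends up 1 and highest_bit 0 for a
    zero-length swap file); this is a sketch, not the kernel code:

    #include <stdio.h>
    #include <stdlib.h>

    /* Model of read_swap_header(): the buggy version effectively returned
     * last_page + 1 (i.e. 1) even for last_page == 0; the fix rejects it. */
    static unsigned long read_header(unsigned long last_page)
    {
            if (!last_page) {
                    fprintf(stderr, "empty swap file\n");
                    return 0;
            }
            return last_page + 1;
    }

    int main(void)
    {
            unsigned long maxpages = read_header(0);        /* zero-length file */

            if (!maxpages)
                    return EXIT_FAILURE;

            /* Without the check above, maxpages would be 1, highest_bit 0,
             * and the modulo below would divide by zero (SIGFPE). */
            unsigned long highest_bit = maxpages - 1;
            unsigned long offset = rand() % highest_bit;

            printf("random start offset %lu\n", offset);
            return 0;
    }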
     
  • [ Upstream commit a0b0d1c345d0317efe594df268feb5ccc99f651e ]

    proc_sys_link_fill_cache() does not take currently unregistering
    sysctl tables into account, which might result in a page fault in
    sysctl_follow_link() - add a check to fix it.

    This bug has been present since v3.4.

    Link: http://lkml.kernel.org/r/20180228013506.4915-1-danilokrummrich@dk-develop.de
    Fixes: 0e47c99d7fe25 ("sysctl: Replace root_list with links between sysctl_table_sets")
    Signed-off-by: Danilo Krummrich
    Acked-by: Kees Cook
    Reviewed-by: Andrew Morton
    Cc: "Luis R . Rodriguez"
    Cc: "Eric W. Biederman"
    Cc: Alexey Dobriyan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Danilo Krummrich
     
  • [ Upstream commit 639d6aafe437a7464399d2a77d006049053df06f ]

    __ro_after_init data gets stuck in the .rodata section. That's normally
    fine because the kernel itself manages the R/W properties.

    But, if we run __change_page_attr() on an area which is
    __ro_after_init, the .rodata checks will trigger and force the area to
    be immediately read-only, even if it is early-ish in boot. This caused
    problems when trying to clear the _PAGE_GLOBAL bit for these areas in
    the PTI code: it cleared _PAGE_GLOBAL like I asked, but also took it
    upon itself to clear _PAGE_RW. The kernel then oopsed the next time it
    wrote to a __ro_after_init data structure.

    To fix this, add the kernel_set_to_readonly check, just like we have
    for kernel text, just a few lines below in this function.

    Signed-off-by: Dave Hansen
    Acked-by: Kees Cook
    Cc: Andrea Arcangeli
    Cc: Andy Lutomirski
    Cc: Arjan van de Ven
    Cc: Borislav Petkov
    Cc: Dan Williams
    Cc: David Woodhouse
    Cc: Greg Kroah-Hartman
    Cc: Hugh Dickins
    Cc: Josh Poimboeuf
    Cc: Juergen Gross
    Cc: Linus Torvalds
    Cc: Nadav Amit
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: linux-mm@kvack.org
    Link: http://lkml.kernel.org/r/20180406205514.8D898241@viggo.jf.intel.com
    Signed-off-by: Ingo Molnar
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Dave Hansen
     
  • [ Upstream commit e3e288121408c3abeed5af60b87b95c847143845 ]

    The pmd_set_huge() and pud_set_huge() functions are used from
    the generic ioremap() code to establish large mappings where this
    is possible.

    But the generic ioremap() code does not check whether the
    PMD/PUD entries are already populated with a non-leaf entry,
    so that any page-table pages these entries point to will be
    lost.

    Further, on x86-32 with SHARED_KERNEL_PMD=0, this causes a
    BUG_ON() in vmalloc_sync_one() when PMD entries are synced
    from swapper_pg_dir to the current page-table. This happens
    because the PMD entry from swapper_pg_dir was promoted to a
    huge-page entry while the current PGD still contains the
    non-leaf entry. Because both entries are present and point
    to a different page, the BUG_ON() triggers.

    This was actually triggered with pti-x32 enabled in a KVM
    virtual machine by the graphics driver.

    A real and better fix for that would be to improve the
    page-table handling in the generic ioremap() code. But that is
    out-of-scope for this patch-set and left for later work.

    Reported-by: David H. Gutteridge
    Signed-off-by: Joerg Roedel
    Reviewed-by: Thomas Gleixner
    Cc: Andrea Arcangeli
    Cc: Andy Lutomirski
    Cc: Boris Ostrovsky
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Dave Hansen
    Cc: David Laight
    Cc: Denys Vlasenko
    Cc: Eduardo Valentin
    Cc: Greg KH
    Cc: Jiri Kosina
    Cc: Josh Poimboeuf
    Cc: Juergen Gross
    Cc: Linus Torvalds
    Cc: Pavel Machek
    Cc: Peter Zijlstra
    Cc: Waiman Long
    Cc: Will Deacon
    Cc: aliguori@amazon.com
    Cc: daniel.gruss@iaik.tugraz.at
    Cc: hughd@google.com
    Cc: keescook@google.com
    Cc: linux-mm@kvack.org
    Link: http://lkml.kernel.org/r/20180411152437.GC15462@8bytes.org
    Signed-off-by: Ingo Molnar
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Joerg Roedel
     
  • [ Upstream commit 471d557afed155b85da237ec46c549f443eeb5de ]

    Currently if we allocate extents beyond an inode's i_size (through the
    fallocate system call) and then fsync the file, we log the extents but
    after a power failure we replay them and then immediately drop them.
    This behaviour has existed since about 2009, commit c71bf099abdd ("Btrfs:
    Avoid orphan inodes cleanup while replaying log"), because it marks
    the inode as an orphan instead of dropping any extents beyond i_size
    before replaying logged extents, so after the log replay, and while
    the mount operation is still ongoing, we find the inode marked as an
    orphan and then perform a truncation (drop extents beyond the inode's
    i_size). Because the processing of orphan inodes is still done
    right after replaying the log and before the mount operation finishes,
    the intention of that commit does not make any sense (at least as
    of today). However reverting that behaviour is not enough, because
    we can not simply discard all extents beyond i_size and then replay
    logged extents, because we risk dropping extents beyond i_size created
    in past transactions, for example:

    add prealloc extent beyond i_size
    fsync - clears the flag BTRFS_INODE_NEEDS_FULL_SYNC from the inode
    transaction commit
    add another prealloc extent beyond i_size
    fsync - triggers the fast fsync path
    power failure

    In that scenario, we would drop the first extent and then replay the
    second one. To fix this just make sure that all prealloc extents
    beyond i_size are logged, and if we find too many (which is far from
    a common case), fallback to a full transaction commit (like we do when
    logging regular extents in the fast fsync path).

    Trivial reproducer:

    $ mkfs.btrfs -f /dev/sdb
    $ mount /dev/sdb /mnt
    $ xfs_io -f -c "pwrite -S 0xab 0 256K" /mnt/foo
    $ sync
    $ xfs_io -c "falloc -k 256K 1M" /mnt/foo
    $ xfs_io -c "fsync" /mnt/foo

    # mount to replay log
    $ mount /dev/sdb /mnt
    # at this point the file only has one extent, at offset 0, size 256K

    A test case for fstests follows soon, covering multiple scenarios that
    involve adding prealloc extents with previous shrinking truncates and
    without such truncates.

    Fixes: c71bf099abdd ("Btrfs: Avoid orphan inodes cleanup while replaying log")
    Signed-off-by: Filipe Manana
    Signed-off-by: David Sterba
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Filipe Manana
     
  • [ Upstream commit af7227338135d2f1b1552bf9a6d43e02dcba10b9 ]

    Currently if some fatal errors occur, like all IO get -EIO, resources
    would be cleaned up when
    a) transaction is being committed or
    b) BTRFS_FS_STATE_ERROR is set

    However, in some rare cases, resources may be left alone after transaction
    gets aborted and umount may run into some ASSERT(), e.g.
    ASSERT(list_empty(&block_group->dirty_list));

    For case a), in btrfs_commit_transaction(), there are several places
    at the beginning where we just call btrfs_end_transaction() without
    cleaning up resources. For case b), it is possible that the trans
    handle doesn't have any dirty stuff; then only the trans handle is
    marked as aborted while BTRFS_FS_STATE_ERROR is not set, so resources
    remain in memory.

    This makes btrfs also check BTRFS_FS_STATE_TRANS_ABORTED to make sure that
    all resources won't stay in memory after umount.

    Signed-off-by: Liu Bo
    Signed-off-by: David Sterba
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Liu Bo
     
  • [ Upstream commit 74c6c71530847808d4e3be7b205719270efee80c ]

    NVMe over Fabrics 1.0 Section 5.2 "Discovery Controller Properties and
    Command Support" Figure 31 "Discovery Controller – Admin Commands"
    explicitly lists all commands but "Get Log Page" and "Identify" as
    reserved, but NetApp reported that the Linux host is sending Keep
    Alive commands to the discovery controller, which is a violation of
    the spec.

    We're already checking for discovery controllers when configuring the
    keep alive timeout but when creating a discovery controller we're not
    hard wiring the keep alive timeout to 0 and thus remain on
    NVME_DEFAULT_KATO for the discovery controller.

    This can be easily reproduced when issuing a direct connect to the
    discovery subsystem using:
    'nvme connect [...] --nqn=nqn.2014-08.org.nvmexpress.discovery'

    Signed-off-by: Johannes Thumshirn
    Fixes: 07bfcd09a288 ("nvme-fabrics: add a generic NVMe over Fabrics library")
    Reported-by: Martin George
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Keith Busch
    Signed-off-by: Jens Axboe
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Johannes Thumshirn
     
  • [ Upstream commit 90fe6f8ff00a07641ca893d64f75ca22ce77cca2 ]

    The test which ensures that the DMI type 1 structure is long enough
    to hold the UUID is off by one: it would fail if the structure is
    exactly 24 bytes long, although that is sufficient to hold the UUID
    (see the illustration at the end of this entry).

    I don't expect this bug to cause problems in practice because all
    implementations I have seen had length 8, 25 or 27 bytes, in line
    with the SMBIOS specifications. But let's fix it still.

    Signed-off-by: Jean Delvare
    Fixes: a814c3597a6b ("firmware: dmi_scan: Check DMI structure length")
    Reviewed-by: Mika Westerberg
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Jean Delvare
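
    The arithmetic behind the off-by-one: the SMBIOS type 1 UUID field
    starts at offset 8 and is 16 bytes long, so a 24-byte structure
    already contains it in full. A small standalone illustration of the
    strict versus inclusive bound (illustrative constants, not the
    dmi_scan code itself):

    #include <stdbool.h>
    #include <stdio.h>

    #define UUID_OFFSET 8
    #define UUID_LEN    16

    static bool uuid_fits_buggy(unsigned int len)
    {
            return len > UUID_OFFSET + UUID_LEN;    /* rejects len == 24 */
    }

    static bool uuid_fits_fixed(unsigned int len)
    {
            return len >= UUID_OFFSET + UUID_LEN;   /* accepts len == 24 */
    }

    int main(void)
    {
            unsigned int len = 24;  /* exactly large enough to hold the UUID */

            printf("len=%u buggy=%d fixed=%d\n", len,
                   uuid_fits_buggy(len), uuid_fits_fixed(len));
            return 0;
    }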
     
  • [ Upstream commit 96a598996f6ac518ac79839ecbb17c91af91f4f7 ]

    When responding to a debug trap (breakpoint) in userspace, the
    kernel's trap handler raised SIGTRAP but returned from the trap via a
    code path that ignored pending signals, resulting in an infinite loop
    re-executing the trapping instruction.

    Signed-off-by: Rich Felker
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Rich Felker
     
  • [ Upstream commit e81b5e01c14add8395dfba7130f8829206bb507d ]

    In mvneta_port_up() we enable the relevant RX and TX port queues by
    writing the queue bit map to the appropriate register.

    q_map must be zero at the beginning of this process.

    Signed-off-by: Yelena Krivosheev
    Acked-by: Thomas Petazzoni
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Yelena Krivosheev
     
  • [ Upstream commit c769accdf3d8a103940bea2979b65556718567e9 ]

    In some situations vlan packets do not have ethernet headers. One
    example is packets from tun devices: users can specify a vlan protocol
    in the tun_pi field instead of an IP protocol. When we have a vlan
    device with reorder_hdr disabled on top of the tun device, such
    packets from tun devices are untagged in skb_vlan_untag() and vlan
    headers are inserted back in vlan_insert_inner_tag().

    vlan_insert_inner_tag() however did not expect packets without
    ethernet headers, so in such a case the size argument for memmove()
    underflowed.

    We don't need to copy headers for packets which have no headers
    preceding the vlan header, so skip memmove() in that case.
    Also don't write the vlan protocol to skb->data when there is not
    enough room for it.

    Fixes: cbe7128c4b92 ("vlan: Fix out of order vlan headers with reorder header off")
    Signed-off-by: Toshiaki Makita
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Toshiaki Makita
     
  • [ Upstream commit ae4745730cf8e693d354ccd4dbaf59ea440c09a9 ]

    In some situations vlan packets do not have ethernet headers. One
    example is packets from tun devices: users can specify a vlan protocol
    in the tun_pi field instead of an IP protocol, and skb_vlan_untag()
    attempts to untag such packets.

    skb_vlan_untag() (more precisely, skb_reorder_vlan_header() called by
    it) however did not expect packets without ethernet headers, so in
    such a case the size argument for memmove() underflowed and triggered
    a crash.

    ====
    BUG: unable to handle kernel paging request at ffff8801cccb8000
    IP: __memmove+0x24/0x1a0 arch/x86/lib/memmove_64.S:43
    PGD 9cee067 P4D 9cee067 PUD 1d9401063 PMD 1cccb7063 PTE 2810100028101
    Oops: 000b [#1] SMP KASAN
    Dumping ftrace buffer:
    (ftrace buffer empty)
    Modules linked in:
    CPU: 1 PID: 17663 Comm: syz-executor2 Not tainted 4.16.0-rc7+ #368
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    RIP: 0010:__memmove+0x24/0x1a0 arch/x86/lib/memmove_64.S:43
    RSP: 0018:ffff8801cc046e28 EFLAGS: 00010287
    RAX: ffff8801ccc244c4 RBX: fffffffffffffffe RCX: fffffffffff6c4c2
    RDX: fffffffffffffffe RSI: ffff8801cccb7ffc RDI: ffff8801cccb8000
    RBP: ffff8801cc046e48 R08: ffff8801ccc244be R09: ffffed0039984899
    R10: 0000000000000001 R11: ffffed0039984898 R12: ffff8801ccc244c4
    R13: ffff8801ccc244c0 R14: ffff8801d96b7c06 R15: ffff8801d96b7b40
    FS: 00007febd562d700(0000) GS:ffff8801db300000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: ffff8801cccb8000 CR3: 00000001ccb2f006 CR4: 00000000001606e0
    DR0: 0000000020000000 DR1: 0000000020000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
    Call Trace:
    memmove include/linux/string.h:360 [inline]
    skb_reorder_vlan_header net/core/skbuff.c:5031 [inline]
    skb_vlan_untag+0x470/0xc40 net/core/skbuff.c:5061
    __netif_receive_skb_core+0x119c/0x3460 net/core/dev.c:4460
    __netif_receive_skb+0x2c/0x1b0 net/core/dev.c:4627
    netif_receive_skb_internal+0x10b/0x670 net/core/dev.c:4701
    netif_receive_skb+0xae/0x390 net/core/dev.c:4725
    tun_rx_batched.isra.50+0x5ee/0x870 drivers/net/tun.c:1555
    tun_get_user+0x299e/0x3c20 drivers/net/tun.c:1962
    tun_chr_write_iter+0xb9/0x160 drivers/net/tun.c:1990
    call_write_iter include/linux/fs.h:1782 [inline]
    new_sync_write fs/read_write.c:469 [inline]
    __vfs_write+0x684/0x970 fs/read_write.c:482
    vfs_write+0x189/0x510 fs/read_write.c:544
    SYSC_write fs/read_write.c:589 [inline]
    SyS_write+0xef/0x220 fs/read_write.c:581
    do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287
    entry_SYSCALL_64_after_hwframe+0x42/0xb7
    RIP: 0033:0x454879
    RSP: 002b:00007febd562cc68 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
    RAX: ffffffffffffffda RBX: 00007febd562d6d4 RCX: 0000000000454879
    RDX: 0000000000000157 RSI: 0000000020000180 RDI: 0000000000000014
    RBP: 000000000072bea0 R08: 0000000000000000 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
    R13: 00000000000006b0 R14: 00000000006fc120 R15: 0000000000000000
    Code: 90 90 90 90 90 90 90 48 89 f8 48 83 fa 20 0f 82 03 01 00 00 48 39 fe 7d 0f 49 89 f0 49 01 d0 49 39 f8 0f 8f 9f 00 00 00 48 89 d1 a4 c3 48 81 fa a8 02 00 00 72 05 40 38 fe 74 3b 48 83 ea 20
    RIP: __memmove+0x24/0x1a0 arch/x86/lib/memmove_64.S:43 RSP: ffff8801cc046e28
    CR2: ffff8801cccb8000
    ====

    We don't need to copy headers for packets which have no headers
    preceding the vlan header, so skip memmove() in that case (see the
    underflow illustration at the end of this entry).

    Fixes: 4bbb3e0e8239 ("net: Fix vlan untag for bridge and vlan_dev with reorder_hdr off")
    Reported-by: Eric Dumazet
    Signed-off-by: Toshiaki Makita
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Toshiaki Makita
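
    The crash mechanism in this and the previous vlan entry is ordinary
    unsigned underflow: memmove() takes a size_t, so computing "header
    length minus vlan header length" for a packet whose preceding headers
    are shorter than a vlan header wraps around to an enormous value
    instead of going negative. A minimal demonstration (the constants and
    the check are illustrative, not the exact skb fields or the exact
    condition used in the kernel patches):

    #include <stdio.h>
    #include <stddef.h>

    #define VLAN_HLEN 4

    int main(void)
    {
            size_t mac_len = 0;     /* tun packet: no ethernet header at all */

            /* What a naive "copy the headers preceding the vlan tag" would
             * pass to memmove(): for mac_len < VLAN_HLEN this wraps to a
             * value close to SIZE_MAX. */
            size_t n = mac_len - VLAN_HLEN;

            printf("memmove size would be %zu bytes\n", n);

            /* The shape of the fix: only move headers when there actually
             * are headers preceding the vlan tag. */
            if (mac_len >= VLAN_HLEN) {
                    /* memmove(dst, src, mac_len - VLAN_HLEN); */
            }
            return 0;
    }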
     
  • [ Upstream commit 58f101bf87e32753342a6924772c6ebb0fbde24a ]

    Today, driver drops received packets which are indicated as
    invalid checksum by the device. Instead of dropping such packets,
    pass them to the stack with CHECKSUM_NONE indication in skb.

    Signed-off-by: Ariel Elior
    Signed-off-by: Manish Chopra
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Manish Chopra
     
  • [ Upstream commit f03dbb06dc380274e351ca4b1ee1587ed4529e62 ]

    My recent change to the netvsc driver in how receive flags are handled
    broke multicast. The Hyper-V/Azure virtual interface does not have a
    multicast filter list; filtering is only all or none. The driver must
    enable all-multicast if any multicast address is present.

    Fixes: 009f766ca238 ("hv_netvsc: filter multicast/broadcast")
    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Stephen Hemminger
     
  • [ Upstream commit 914b6dfff790544d9b77dfd1723adb3745ec9700 ]

    A crash is observed when kmemleak_scan accesses the object->pointer,
    likely due to the following race.

    TASK A                  TASK B                     TASK C
    kmemleak_write
     (with "scan" and
     NOT "scan=on")
    kmemleak_scan()
                            create_object
                            kmem_cache_alloc fails
                            kmemleak_disable
                            kmemleak_do_cleanup
                            kmemleak_free_enabled = 0
                                                       kfree
                                                       kmemleak_free bails out
                                                        (kmemleak_free_enabled is 0)
                                                       slub frees object->pointer
    update_checksum
    crash - object->pointer
     freed (DEBUG_PAGEALLOC)

    kmemleak_do_cleanup waits for the scan thread to complete, but not for
    a direct call to kmemleak_scan via kmemleak_write. So add a wait for
    kmemleak_scan completion before disabling kmemleak_free, and while at
    it fix the comment on stop_scan_thread.

    [vinmenon@codeaurora.org: fix stop_scan_thread comment]
    Link: http://lkml.kernel.org/r/1522219972-22809-1-git-send-email-vinmenon@codeaurora.org
    Link: http://lkml.kernel.org/r/1522063429-18992-1-git-send-email-vinmenon@codeaurora.org
    Signed-off-by: Vinayak Menon
    Reviewed-by: Catalin Marinas
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Vinayak Menon
     
  • [ Upstream commit c7f26ccfb2c31eb1bf810ba13d044fcf583232db ]

    Attempting to hotplug CPUs with CONFIG_VM_EVENT_COUNTERS enabled can
    cause vmstat_update() to report a BUG due to preemption not being
    disabled around smp_processor_id().

    Discovered on Ubiquiti EdgeRouter Pro with Cavium Octeon II processor.

    BUG: using smp_processor_id() in preemptible [00000000] code:
    kworker/1:1/269
    caller is vmstat_update+0x50/0xa0
    CPU: 0 PID: 269 Comm: kworker/1:1 Not tainted
    4.16.0-rc4-Cavium-Octeon-00009-gf83bbd5-dirty #1
    Workqueue: mm_percpu_wq vmstat_update
    Call Trace:
    show_stack+0x94/0x128
    dump_stack+0xa4/0xe0
    check_preemption_disabled+0x118/0x120
    vmstat_update+0x50/0xa0
    process_one_work+0x144/0x348
    worker_thread+0x150/0x4b8
    kthread+0x110/0x140
    ret_from_kernel_thread+0x14/0x1c

    Link: http://lkml.kernel.org/r/1520881552-25659-1-git-send-email-steven.hill@cavium.com
    Signed-off-by: Steven J. Hill
    Reviewed-by: Andrew Morton
    Cc: Tejun Heo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Steven J. Hill
     
  • [ Upstream commit 299815a4fba9f3c7a81434dba0072148f1690608 ]

    This patch fixes commit 5f48f0bd4e36 ("mm, page_owner: skip unnecessary
    stack_trace entries").

    If we skip the first two entries, the logic that checks for a count
    value of 2 to detect recursion is broken, and the code goes into
    one-depth recursion.

    So we need to check for only one call of _RET_IP_ (__set_page_owner)
    while checking for recursion.

    Current Backtrace while checking for recursion:-

    (save_stack) from (__set_page_owner) // (But recursion returns true here)
    (__set_page_owner) from (get_page_from_freelist)
    (get_page_from_freelist) from (__alloc_pages_nodemask)
    (__alloc_pages_nodemask) from (depot_save_stack)
    (depot_save_stack) from (save_stack) // recursion should return true here
    (save_stack) from (__set_page_owner)
    (__set_page_owner) from (get_page_from_freelist)
    (get_page_from_freelist) from (__alloc_pages_nodemask+)
    (__alloc_pages_nodemask) from (depot_save_stack)
    (depot_save_stack) from (save_stack)
    (save_stack) from (__set_page_owner)
    (__set_page_owner) from (get_page_from_freelist)

    Correct Backtrace with fix:

    (save_stack) from (__set_page_owner) // recursion returned true here
    (__set_page_owner) from (get_page_from_freelist)
    (get_page_from_freelist) from (__alloc_pages_nodemask+)
    (__alloc_pages_nodemask) from (depot_save_stack)
    (depot_save_stack) from (save_stack)
    (save_stack) from (__set_page_owner)
    (__set_page_owner) from (get_page_from_freelist)

    Link: http://lkml.kernel.org/r/1521607043-34670-1-git-send-email-maninder1.s@samsung.com
    Fixes: 5f48f0bd4e36 ("mm, page_owner: skip unnecessary stack_trace entries")
    Signed-off-by: Maninder Singh
    Signed-off-by: Vaneet Narang
    Acked-by: Vlastimil Babka
    Cc: Michal Hocko
    Cc: Oscar Salvador
    Cc: Greg Kroah-Hartman
    Cc: Ayush Mittal
    Cc: Prakash Gupta
    Cc: Vinayak Menon
    Cc: Vasyl Gomonovych
    Cc: Amit Sahrawat
    Cc:
    Cc: Vaneet Narang
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Maninder Singh
     
  • [ Upstream commit 880cd276dff17ea29e9a8404275c9502b265afa7 ]

    All the root caches are linked into slab_root_caches, which was
    introduced by commit 510ded33e075 ("slab: implement slab_root_caches
    list"), but it missed adding SLAB's kmem_cache to it.

    While experimenting with opt-in/opt-out kmem accounting, I noticed
    system crashes due to a NULL dereference inside cache_from_memcg_idx()
    while dereferencing kmem_cache.memcg_params.memcg_caches. A clean
    upstream kernel will not see these crashes, but SLAB should be
    consistent with SLUB, which does link its boot caches (kmem_cache_node
    and kmem_cache) into slab_root_caches.

    Link: http://lkml.kernel.org/r/20180319210020.60289-1-shakeelb@google.com
    Fixes: 510ded33e075c ("slab: implement slab_root_caches list")
    Signed-off-by: Shakeel Butt
    Cc: Tejun Heo
    Cc: Vladimir Davydov
    Cc: Greg Thelen
    Cc: Christoph Lameter
    Cc: Pekka Enberg
    Cc: David Rientjes
    Cc: Joonsoo Kim
    Cc: Johannes Weiner
    Cc: Michal Hocko
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Shakeel Butt
     
  • [ Upstream commit b9fc828debc8ac2bb21b5819a44d2aea456f1c95 ]

    Since commit c5ad119fb6c09b0297446be05bd66602fa564758
    ("net: sched: pfifo_fast use skb_array") the driver is exposed
    to an issue where it hits NULL skbs while handling TX
    completions. The driver uses mmiowb() to flush the writes to the
    doorbell bar, which is a write-combined bar; however, on x86
    mmiowb() does not flush the write-combined buffer.

    This patch fixes the problem by replacing mmiowb() with wmb()
    after the write-combined doorbell write, so that the writes are
    flushed and synchronized from more than one processor.

    V1->V2:
    -------
    This patch was marked as "superseded" in patchwork
    (not really sure for what reason). Resending it as v2.

    Signed-off-by: Ariel Elior
    Signed-off-by: Manish Chopra

    Signed-off-by: David S. Miller

    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Manish Chopra
     
  • [ Upstream commit f8437520704cfd9cc442a99d73ed708a3cdadaf9 ]

    Since d5d332d3f7e8, a couple of links in scripts/dtc/include-prefixes
    are additionally required in order to build device trees with the header
    package.

    Signed-off-by: Jan Kiszka
    Reviewed-by: Riku Voipio
    Signed-off-by: Masahiro Yamada
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Jan Kiszka
     
  • [ Upstream commit b85ab56c3f81c5a24b5a5213374f549df06430da ]

    llc_conn_send_pdu() pushes the skb into the write queue and
    calls llc_conn_send_pdus() to flush them out. However, the
    status of dev_queue_xmit() is not returned to the caller,
    in this case llc_conn_state_process().

    llc_conn_state_process() needs to hold the skb regardless of
    success or failure, because it still uses it afterwards;
    therefore we should hold the skb before dev_queue_xmit() when
    that skb is the one being processed by llc_conn_state_process().

    Other callers can just pass NULL and ignore the return value
    as they do now.

    Reported-by: Noam Rathaus
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Cong Wang
     
  • [ Upstream commit bd6271039ee6f0c9b468148fc2d73e0584af6b4f ]

    The following pattern fails to compile while the same pattern
    with alternative_call() does:

        if (...)
                alternative_call_2(...);
        else
                alternative_call_2(...);

    as it expands into

        if (...)
        {
        };

    (A standalone illustration of this macro pitfall follows at the end
    of this entry.)

    Signed-off-by: Thomas Gleixner
    Acked-by: Borislav Petkov
    Link: https://lkml.kernel.org/r/20180114120504.GA11368@avx2
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Alexey Dobriyan
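
    The underlying C pitfall is generic: a macro that expands to a braced
    compound statement, followed by the caller's ';', produces an empty
    statement that terminates the if, so a following else has nothing to
    attach to. A standalone illustration with hypothetical macros (the
    conventional do { ... } while (0) wrapper is shown as one way out; the
    actual kernel fix may differ):

    #include <stdio.h>

    /* Expands to a bare block: "BROKEN(x);" becomes "{ ... } ;" and the
     * stray ';' ends the if statement, orphaning any following else. */
    #define BROKEN(x)  { printf("broken %d\n", (x)); }

    /* Conventional fix: a do/while(0) swallows the caller's ';'. */
    #define FIXED(x)   do { printf("fixed %d\n", (x)); } while (0)

    int main(void)
    {
            int cond = 1;

    #if 0   /* does not compile: "else without a previous if" */
            if (cond)
                    BROKEN(1);
            else
                    BROKEN(2);
    #endif

            if (cond)
                    FIXED(1);
            else
                    FIXED(2);

            return 0;
    }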
     
  • [ Upstream commit 71eb9ee9596d8df3d5723c3cfc18774c6235e8b1 ]

    This patch fixes a bug in how pebs->real_ip is handled in the PEBS
    handler. real_ip only exists on Haswell and later processors. It is
    the eventing IP, i.e., where the event occurred, as opposed to
    pebs->ip, which is the PEBS interrupt IP and is always off by one.

    The problem is that real_ip, just like the IP, needs to be fixed up,
    because PEBS does not record all the machine state registers, in
    particular the code segment (cs). This is why we have the
    set_linear_ip() function. The problem was that set_linear_ip() was
    only used on pebs->ip and not on pebs->real_ip.

    We have profiles which ran into invalid callstacks because of this.
    Here is an example:

    ..... 0: ffffffffffffff80 recent entry, marker kernel v
    ..... 1: 000000000040044d cs=10 user_mode(regs)=0

    The problem is that the kernel entry in 1: points to a user level
    address. How can that be?

    The reason is that with PEBS sampling, the instruction that caused the
    event to occur and the instruction where the CPU was when the
    interrupt was posted may be far apart, and sometime during that
    window the privilege level may change. This happens, for instance,
    when the PEBS sample is taken close to a kernel entry point. Here the
    PEBS eventing IP (real_ip) captured a user-level instruction, but by
    the time the PMU interrupt fired, the processor had already entered
    kernel space. This is why the debug output shows a user address with
    user_mode() false.

    The problem comes from PEBS not recording the code segment (cs) register.
    The register is used in x86_64 to determine if executing in kernel vs user
    space. This is okay because the kernel has a software workaround called
    set_linear_ip(). But the issue in setup_pebs_sample_data() is that
    set_linear_ip() is never called on the real_ip value when it is available
    (Haswell and later) and precise_ip > 1.

    This patch fixes this problem and eliminates the callchain discrepancy.

    The patch restructures the code around set_linear_ip() to minimize the number
    of times the IP has to be set.

    Signed-off-by: Stephane Eranian
    Cc: Alexander Shishkin
    Cc: Arnaldo Carvalho de Melo
    Cc: Jiri Olsa
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Vince Weaver
    Cc: kan.liang@intel.com
    Link: http://lkml.kernel.org/r/1521788507-10231-1-git-send-email-eranian@google.com
    Signed-off-by: Ingo Molnar
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Stephane Eranian
     
  • [ Upstream commit f125376b06bcc57dfb0216ac8d6ec6d5dcf81025 ]

    Add a dependency on switchdev being configured, as any user-space
    control plane SW is expected to use the HW switchdev ID to locate the
    representors related to VFs of a certain PF and apply SW/offloaded
    switching on them.

    Fixes: e80541ecabd5 ('net/mlx5: Add CONFIG_MLX5_ESWITCH Kconfig')
    Signed-off-by: Or Gerlitz
    Reviewed-by: Mark Bloch
    Signed-off-by: Saeed Mahameed
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Or Gerlitz
     
  • [ Upstream commit 3c82b372a9f44aa224b8d5106ff6f1ad516fa8a8 ]

    It's required to create a modules.alias entry via the
    MODULE_DEVICE_TABLE helper for the OF platform driver; otherwise,
    module autoloading cannot work.

    Signed-off-by: Sean Wang
    Reviewed-by: Andrew Lunn
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Sean Wang
     
  • [ Upstream commit 5c78f6bfae2b10ff70e21d343e64584ea6280c26 ]

    vlan_vids_add_by_dev() is called right after the dev hwaddr sync, so
    on the error path it should unsync the dev hwaddr. Otherwise, the
    slave dev's hwaddr will never be unsynced when this error happens.

    Fixes: 1ff412ad7714 ("bonding: change the bond's vlan syncing functions with the standard ones")
    Signed-off-by: Xin Long
    Reviewed-by: Nikolay Aleksandrov
    Acked-by: Andy Gospodarek
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Xin Long
     
  • [ Upstream commit 743989254ea9f132517806d8893ca9b6cf9dc86b ]

    The BroadMobi BM806U is a Qualcomm MDM9225 based 3G/4G modem.
    The tested hardware, a BM806U, is mounted on a D-Link DWR-921-C3
    router. The USB ID is added to qmi_wwan.c to allow QMI communication
    with the BM806U.

    Tested on a 4.14 kernel and OpenWRT.

    Signed-off-by: Pawel Dembicki
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Pawel Dembicki
     
  • [ Upstream commit e69647a19c870c2f919e4d5023af8a515e8ef25f ]

    Description:
    EEE does not work with the lan7800 when AutoSpeed is not set.
    (This can happen when the EEPROM is not populated or is configured
    incorrectly.)

    Root-Cause:
    When EEE is enabled, the ASD bit in the MAC config register is not
    set, i.e. it is left in its default state, causing EEE to fail.

    Fix:
    Set the register when the EEPROM is not present.

    Fixes: 55d7de9de6c3 ("Microchip's LAN7800 family USB 2/3 to 10/100/1000 Ethernet device driver")
    Signed-off-by: Raghuram Chary J
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Raghuram Chary J