21 Sep, 2016

1 commit

  • While running a compile on arm64, I hit a memory exposure report:

    usercopy: kernel memory exposure attempt detected from fffffc0000f3b1a8 (buffer_head) (1 bytes)
    ------------[ cut here ]------------
    kernel BUG at mm/usercopy.c:75!
    Internal error: Oops - BUG: 0 [#1] SMP
    Modules linked in: ip6t_rpfilter ip6t_REJECT
    nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_broute bridge stp
    llc ebtable_nat ip6table_security ip6table_raw ip6table_nat
    nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle
    iptable_security iptable_raw iptable_nat nf_conntrack_ipv4
    nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle
    ebtable_filter ebtables ip6table_filter ip6_tables vfat fat xgene_edac
    xgene_enet edac_core i2c_xgene_slimpro i2c_core at803x realtek xgene_dma
    mdio_xgene gpio_dwapb gpio_xgene_sb xgene_rng mailbox_xgene_slimpro nfsd
    auth_rpcgss nfs_acl lockd grace sunrpc xfs libcrc32c sdhci_of_arasan
    sdhci_pltfm sdhci mmc_core xhci_plat_hcd gpio_keys
    CPU: 0 PID: 19744 Comm: updatedb Tainted: G W 4.8.0-rc3-threadinfo+ #1
    Hardware name: AppliedMicro X-Gene Mustang Board/X-Gene Mustang Board, BIOS 3.06.12 Aug 12 2016
    task: fffffe03df944c00 task.stack: fffffe00d128c000
    PC is at __check_object_size+0x70/0x3f0
    LR is at __check_object_size+0x70/0x3f0
    ...
    [] __check_object_size+0x70/0x3f0
    [] filldir64+0x158/0x1a0
    [] __fat_readdir+0x4a0/0x558 [fat]
    [] fat_readdir+0x34/0x40 [fat]
    [] iterate_dir+0x190/0x1e0
    [] SyS_getdents64+0x88/0x120
    [] el0_svc_naked+0x24/0x28

    fffffc0000f3b1a8 is a module address. Modules may have compiled-in
    strings that get copied to userspace. In this instance, it looks
    like the string is ".", which matches the reported size of 1 byte.
    Extend the is_vmalloc_addr() check to is_vmalloc_or_module_addr()
    to cover all possible cases.

    Signed-off-by: Laura Abbott
    Signed-off-by: Kees Cook

    Laura Abbott
     

20 Sep, 2016

6 commits

  • During cgroup2 rollout into production, we started encountering css
    refcount underflows and css access crashes in the memory controller.
    Splitting the heavily shared css reference counter into logical users
    narrowed the imbalance down to the cgroup2 socket memory accounting.

    The problem turns out to be the per-cpu charge cache. Cgroup1 had a
    separate socket counter, but the new cgroup2 socket accounting goes
    through the common charge path that uses a shared per-cpu cache for all
    memory that is being tracked. Those caches are safe against scheduling
    preemption, but not against interrupts - such as the newly added packet
    receive path. When cache draining is interrupted by network RX taking
    pages out of the cache, the resuming drain operation will put references
    of in-use pages, thus causing the imbalance.

    Disable IRQs during all per-cpu charge cache operations.

    Fixes: f7e1cb6ec51b ("mm: memcontrol: account socket memory in unified hierarchy memory controller")
    Link: http://lkml.kernel.org/r/20160914194846.11153-1-hannes@cmpxchg.org
    Signed-off-by: Johannes Weiner
    Acked-by: Tejun Heo
    Cc: "David S. Miller"
    Cc: Michal Hocko
    Cc: Vladimir Davydov
    Cc: [4.5+]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Weiner
     
  • Commit 62c230bc1790 ("mm: add support for a filesystem to activate
    swap files and use direct_IO for writing swap pages") replaced the
    swap_aops dirty hook from __set_page_dirty_no_writeback() with
    swap_set_page_dirty().

    For normal cases without these special SWP flags, the code path falls
    back to __set_page_dirty_no_writeback(), so the behaviour is expected
    to be the same as before.

    But swap_set_page_dirty() uses the page_swap_info() helper to get the
    swap_info_struct and check for flags like SWP_FILE and SWP_BLKDEV as
    those features require. This helper has a BUG_ON(!PageSwapCache(page))
    which is racy and safe only for the set_page_dirty_lock() path.

    For the set_page_dirty() path, which often needs to be callable from
    irq context, kswapd can toggle the flag behind the caller's back while
    the call is executing, when the system is low on memory and heavy
    swapping is ongoing.

    This ends up in an undesired kernel panic.

    This patch moves the check out of the helper and into its users as
    appropriate, fixing the kernel panic on the described path. A couple
    of users already take care of the SwapCache condition, so they are
    skipped.

    Link: http://lkml.kernel.org/r/1473460718-31013-1-git-send-email-santosh.shilimkar@oracle.com
    Signed-off-by: Santosh Shilimkar
    Cc: Mel Gorman
    Cc: Joe Perches
    Cc: Peter Zijlstra
    Cc: Rik van Riel
    Cc: David S. Miller
    Cc: Jens Axboe
    Cc: Michal Hocko
    Cc: Hugh Dickins
    Cc: Al Viro
    Cc: [4.7.x]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Santosh Shilimkar
     
    dump_page() uses page_mapcount() to get the mapcount of the page.
    page_mapcount() has VM_BUG_ON_PAGE(PageSlab(page)), as mapcount
    doesn't make sense for slab pages and the field in struct page is
    used for other information.

    This leads to recursion if dump_page() is called for a slab page and
    DEBUG_VM is enabled:

    dump_page() -> page_mapcount() -> VM_BUG_ON_PAGE() -> dump_page -> ...

    Let's avoid calling page_mapcount() for slab pages in dump_page().

    Link: http://lkml.kernel.org/r/20160908082137.131076-1-kirill.shutemov@linux.intel.com
    Signed-off-by: Kirill A. Shutemov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
     
  • Currently, khugepaged does not permit swapin if there are enough young
    pages in a THP. The problem is that when a THP does not have enough
    young pages, khugepaged leaks the mapped ptes.

    This patch prevents leaking the mapped ptes.

    Link: http://lkml.kernel.org/r/1472820276-7831-1-git-send-email-ebru.akagunduz@gmail.com
    Signed-off-by: Ebru Akagunduz
    Suggested-by: Andrea Arcangeli
    Reviewed-by: Andrea Arcangeli
    Reviewed-by: Rik van Riel
    Cc: Vlastimil Babka
    Cc: Mel Gorman
    Cc: Kirill A. Shutemov
    Cc: Johannes Weiner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ebru Akagunduz
     
    hugepage_vma_revalidate() re-checks whether we should still try to
    collapse small pages into a huge one after re-acquiring mmap_sem.

    The problem Dmitry Vyukov reported[1] is that the vma found by
    hugepage_vma_revalidate() can be suitable for huge pages, but not be
    the same vma we had before dropping mmap_sem. And dereferencing the
    original vma can lead to fun results.

    Let's use the vma hugepage_vma_revalidate() found instead of assuming
    it's the same as the one we had before the lock was dropped.

    [1] http://lkml.kernel.org/r/CACT4Y+Z3gigBvhca9kRJFcjX0G70V_nRhbwKBU+yGoESBDKi9Q@mail.gmail.com

    Link: http://lkml.kernel.org/r/20160907122559.GA6542@black.fi.intel.com
    Signed-off-by: Kirill A. Shutemov
    Reported-by: Dmitry Vyukov
    Reviewed-by: Andrea Arcangeli
    Cc: Ebru Akagunduz
    Cc: Vlastimil Babka
    Cc: Mel Gorman
    Cc: Johannes Weiner
    Cc: Vegard Nossum
    Cc: Sasha Levin
    Cc: Konstantin Khlebnikov
    Cc: Andrey Ryabinin
    Cc: Greg Thelen
    Cc: Suleiman Souhlal
    Cc: Hugh Dickins
    Cc: David Rientjes
    Cc: syzkaller
    Cc: Kostya Serebryany
    Cc: Alexander Potapenko
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
     
  • Commit 394e31d2ceb4 ("mem-hotplug: alloc new page from a nearest
    neighbor node when mem-offline") introduced new_node_page() for memory
    hotplug.

    In new_node_page(), the nid is cleared from the nodemask before
    calling __alloc_pages_nodemask(). But if it is the only node in the
    system and the first-round allocation fails, the retry cannot get
    memory from the now-empty nodemask and will trigger an OOM.

    The patch checks whether it is the last node in the system, and if
    so, does not clear the nid from the nodemask.

    Fixes: 394e31d2ceb4 ("mem-hotplug: alloc new page from a nearest neighbor node when mem-offline")
    Link: http://lkml.kernel.org/r/1473044391.4250.19.camel@TP420
    Signed-off-by: Li Zhong
    Reported-by: John Allen
    Acked-by: Vlastimil Babka
    Cc: Xishi Qiu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Li Zhong
     

11 Sep, 2016

1 commit

  • Pull libnvdimm fixes from Dan Williams:
    "nvdimm fixes for v4.8, two of them are tagged for -stable:

    - Fix devm_memremap_pages() to use track_pfn_insert(). Otherwise,
    DAX pmd mappings end up with an uncached pgprot, and unusable
    performance for the device-dax interface. The device-dax interface
    appeared in 4.7 so this is tagged for -stable.

    - Fix a couple VM_BUG_ON() checks in the show_smaps() path to
    understand DAX pmd entries. This fix is tagged for -stable.

    - Fix a mis-merge of the nfit machine-check handler to flip the
    polarity of an if() to match the final version of the patch that
    Vishal sent for 4.8-rc1. Without this the nfit machine check
    handler never detects / inserts new 'badblocks' entries which
    applications use to identify lost portions of files.

    - For test purposes, fix the nvdimm_clear_poison() path to operate on
    legacy / simulated nvdimm memory ranges. Without this fix a test
    can set badblocks, but never clear them on these ranges.

    - Fix the range checking done by dax_dev_pmd_fault(). This is not
    tagged for -stable since this problem is mitigated by specifying
    aligned resources at device-dax setup time.

    These patches have appeared in a -next release over the past week. The
    recent rebase you can see in the timestamps was to drop an invalid fix
    as identified by the updated device-dax unit tests [1]. The -mm
    touches have an ack from Andrew"

    [1]: "[ndctl PATCH 0/3] device-dax test for recent kernel bugs"
    https://lists.01.org/pipermail/linux-nvdimm/2016-September/006855.html

    * 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
    libnvdimm: allow legacy (e820) pmem region to clear bad blocks
    nfit, mce: Fix SPA matching logic in MCE handler
    mm: fix cache mode of dax pmd mappings
    mm: fix show_smap() for zone_device-pmd ranges
    dax: fix mapping size check

    Linus Torvalds
     

10 Sep, 2016

1 commit

  • Attempting to dump /proc/<pid>/smaps for a process with pmd dax
    mappings currently results in the following VM_BUG_ONs:

    kernel BUG at mm/huge_memory.c:1105!
    task: ffff88045f16b140 task.stack: ffff88045be14000
    RIP: 0010:[] [] follow_trans_huge_pmd+0x2cb/0x340
    [..]
    Call Trace:
    [] smaps_pte_range+0xa0/0x4b0
    [] ? vsnprintf+0x255/0x4c0
    [] __walk_page_range+0x1fe/0x4d0
    [] walk_page_vma+0x62/0x80
    [] show_smap+0xa6/0x2b0

    kernel BUG at fs/proc/task_mmu.c:585!
    RIP: 0010:[] [] smaps_pte_range+0x499/0x4b0
    Call Trace:
    [] ? vsnprintf+0x255/0x4c0
    [] __walk_page_range+0x1fe/0x4d0
    [] walk_page_vma+0x62/0x80
    [] show_smap+0xa6/0x2b0

    These locations are sanity checking page flags that must be set for an
    anonymous transparent huge page, but are not set for the zone_device
    pages associated with dax mappings.

    Cc: Ross Zwisler
    Cc: Kirill A. Shutemov
    Acked-by: Andrew Morton
    Signed-off-by: Dan Williams

    Dan Williams
     

08 Sep, 2016

1 commit

  • A custom allocator without __GFP_COMP that copies to userspace has
    been found in vmw_execbuf_process[1], so disable the page-span checker
    by placing it behind a CONFIG option, leaving such cases to be tracked
    down as future work.

    [1] https://bugzilla.redhat.com/show_bug.cgi?id=1373326

    Reported-by: Vinson Lee
    Fixes: f5509cc18daa ("mm: Hardened usercopy")
    Signed-off-by: Kees Cook

    Kees Cook
     

02 Sep, 2016

3 commits

  • KASAN allocates memory from the page allocator as part of
    kmem_cache_free(), and that can reference current->mempolicy through any
    number of allocation functions. It needs to be NULL'd out before the
    final reference is dropped to prevent a use-after-free bug:

    BUG: KASAN: use-after-free in alloc_pages_current+0x363/0x370 at addr ffff88010b48102c
    CPU: 0 PID: 15425 Comm: trinity-c2 Not tainted 4.8.0-rc2+ #140
    ...
    Call Trace:
    dump_stack
    kasan_object_err
    kasan_report_error
    __asan_report_load2_noabort
    alloc_pages_current

    Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1608301442180.63329@chino.kir.corp.google.com
    Fixes: cd11016e5f52 ("mm, kasan: stackdepot implementation. Enable stackdepot for SLAB")
    Signed-off-by: David Rientjes
    Reported-by: Vegard Nossum
    Acked-by: Andrey Ryabinin
    Cc: Alexander Potapenko
    Cc: Dmitry Vyukov
    Cc: [4.6+]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Rientjes
     
  • Firmware Assisted Dump (FA_DUMP) on ppc64 reserves substantial amounts
    of memory when booting a secondary kernel. Srikar Dronamraju reported
    that multiple nodes may have no memory managed by the buddy allocator
    but still return true for populated_zone().

    Commit 1d82de618ddd ("mm, vmscan: make kswapd reclaim in terms of
    nodes") was reported to cause kswapd to spin at 100% CPU usage when
    fadump was enabled. The old code happened to deal with the situation of
    a populated node with zero free pages by coincidence, but the current
    code tries to reclaim populated zones without realising that is
    impossible.

    We cannot just convert populated_zone() as many existing users really
    need to check for present_pages. This patch introduces a managed_zone()
    helper and uses it in the few cases where it is critical that the check
    is made for managed pages -- zonelist construction and page reclaim.

    Link: http://lkml.kernel.org/r/20160831195104.GB8119@techsingularity.net
    Signed-off-by: Mel Gorman
    Reported-by: Srikar Dronamraju
    Tested-by: Srikar Dronamraju
    Acked-by: Michal Hocko
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman
     
  • There have been several reports about premature OOM killer invocation
    in the 4.7 kernel, when an order-2 allocation request (for the kernel
    stack) invoked the OOM killer even during basic workloads (light IO or
    even a kernel
    compile on some filesystems). In all reported cases the memory is
    fragmented and there are no order-2+ pages available. There is usually
    a large amount of slab memory (usually dentries/inodes) and further
    debugging has shown that there are way too many unmovable blocks which
    are skipped during the compaction. Multiple reporters have confirmed
    that the current linux-next which includes [1] and [2] helped and OOMs
    are not reproducible anymore.

    A simpler fix for the late rc and stable is to simply ignore the
    compaction feedback and retry as long as there is reclaim progress and
    we are not getting OOM for order-0 pages. We already do that for
    CONFIG_COMPACTION=n, so let's reuse the same code when compaction is
    enabled as well.

    [1] http://lkml.kernel.org/r/20160810091226.6709-1-vbabka@suse.cz
    [2] http://lkml.kernel.org/r/f7a9ea9d-bb88-bfd6-e340-3a933559305a@suse.cz

    Fixes: 0a0337e0d1d1 ("mm, oom: rework oom detection")
    Link: http://lkml.kernel.org/r/20160823074339.GB23577@dhcp22.suse.cz
    Signed-off-by: Michal Hocko
    Tested-by: Olaf Hering
    Tested-by: Ralf-Peter Rohbeck
    Cc: Markus Trippelsdorf
    Cc: Arkadiusz Miskiewicz
    Cc: Ralf-Peter Rohbeck
    Cc: Jiri Slaby
    Cc: Vlastimil Babka
    Cc: Joonsoo Kim
    Cc: Tetsuo Handa
    Cc: David Rientjes
    Cc: [4.7.x]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michal Hocko
     

27 Aug, 2016

4 commits

  • For DAX inodes we need to be careful to never have page cache pages in
    the mapping->page_tree. This radix tree should be composed only of DAX
    exceptional entries and zero pages.

    ltp's readahead02 test was triggering a warning because we were trying
    to insert a DAX exceptional entry but found that a page cache page had
    already been inserted into the tree. This page was being inserted into
    the radix tree in response to a readahead(2) call.

    Readahead doesn't make sense for DAX inodes, but we don't want it to
    report a failure either. Instead, we just return success and don't do
    any work.

    Link: http://lkml.kernel.org/r/20160824221429.21158-1-ross.zwisler@linux.intel.com
    Signed-off-by: Ross Zwisler
    Reported-by: Jeff Moyer
    Cc: Dan Williams
    Cc: Dave Chinner
    Cc: Dave Hansen
    Cc: Jan Kara
    Cc: [4.5+]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ross Zwisler
     
  • A bugfix in v4.8-rc2 introduced a harmless warning when
    CONFIG_MEMCG_SWAP is disabled but CONFIG_MEMCG is enabled:

    mm/memcontrol.c:4085:27: error: 'mem_cgroup_id_get_online' defined but not used [-Werror=unused-function]
    static struct mem_cgroup *mem_cgroup_id_get_online(struct mem_cgroup *memcg)

    This moves the function inside the #ifdef block that hides the
    calling function, to avoid the warning.

    Fixes: 1f47b61fb407 ("mm: memcontrol: fix swap counter leak on swapout from offline cgroup")
    Link: http://lkml.kernel.org/r/20160824113733.2776701-1-arnd@arndb.de
    Signed-off-by: Arnd Bergmann
    Acked-by: Michal Hocko
    Acked-by: Vladimir Davydov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arnd Bergmann
     
  • The current wording of the COMPACTION Kconfig help text doesn't
    emphasise that disabling COMPACTION might cripple the page allocator,
    which relies quite heavily on compaction for high-order requests, and
    that an unexpected OOM can happen without it. Make sure we are vocal
    about that.

    Link: http://lkml.kernel.org/r/20160823091726.GK23577@dhcp22.suse.cz
    Signed-off-by: Michal Hocko
    Cc: Markus Trippelsdorf
    Cc: Mel Gorman
    Cc: Joonsoo Kim
    Cc: Vlastimil Babka
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michal Hocko
     
  • While adding proper userfaultfd_wp support with bits in pagetable and
    swap entry to avoid false positives WP userfaults through swap/fork/
    KSM/etc, I've been adding a framework that mostly mirrors soft dirty.

    So I noticed one place where I had to add uffd_wp support to the
    pagetables that wasn't covered by soft_dirty, and I think it should
    have been.

    Example: in the THP migration code migrate_misplaced_transhuge_page()
    pmd_mkdirty is called unconditionally after mk_huge_pmd.

    entry = mk_huge_pmd(new_page, vma->vm_page_prot);
    entry = maybe_pmd_mkwrite(pmd_mkdirty(entry), vma);

    That sets soft dirty too (it's a false positive for soft dirty, the soft
    dirty bit could be more finegrained and transfer the bit like uffd_wp
    will do.. pmd/pte_uffd_wp() enforces the invariant that when it's set
    pmd/pte_write is not set).

    However in the THP split there's no unconditional pmd_mkdirty after
    mk_huge_pmd and pte_swp_mksoft_dirty isn't called after the migration
    entry is created. The code sets the dirty bit in the struct page
    instead of setting it in the pagetable (which is fully equivalent as far
    as the real dirty bit is concerned, as the whole point of pagetable bits
    is to be eventually flushed out to the page, but that is not
    equivalent for the soft-dirty bit that gets lost in translation).

    This was found by code review only and totally untested as I'm working
    to actually replace soft dirty and I don't have time to test potential
    soft dirty bugfixes as well :).

    Transfer the soft_dirty from pmd to pte during THP splits.

    This fix avoids losing the soft_dirty bit and avoids userland memory
    corruption in the checkpoint.

    Fixes: eef1b3ba053aa6 ("thp: implement split_huge_pmd()")
    Link: http://lkml.kernel.org/r/1471610515-30229-2-git-send-email-aarcange@redhat.com
    Signed-off-by: Andrea Arcangeli
    Acked-by: Pavel Emelyanov
    Cc: "Kirill A. Shutemov"
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrea Arcangeli
     

23 Aug, 2016

2 commits

  • When running with a local patch which moves the '_stext' symbol to the
    very beginning of the kernel text area, I got the following panic with
    CONFIG_HARDENED_USERCOPY:

    usercopy: kernel memory exposure attempt detected from ffff88103dfff000 () (4096 bytes)
    ------------[ cut here ]------------
    kernel BUG at mm/usercopy.c:79!
    invalid opcode: 0000 [#1] SMP
    ...
    CPU: 0 PID: 4800 Comm: cp Not tainted 4.8.0-rc3.after+ #1
    Hardware name: Dell Inc. PowerEdge R720/0X3D66, BIOS 2.5.4 01/22/2016
    task: ffff880817444140 task.stack: ffff880816274000
    RIP: 0010:[] __check_object_size+0x76/0x413
    RSP: 0018:ffff880816277c40 EFLAGS: 00010246
    RAX: 000000000000006b RBX: ffff88103dfff000 RCX: 0000000000000000
    RDX: 0000000000000000 RSI: ffff88081f80dfa8 RDI: ffff88081f80dfa8
    RBP: ffff880816277c90 R08: 000000000000054c R09: 0000000000000000
    R10: 0000000000000005 R11: 0000000000000006 R12: 0000000000001000
    R13: ffff88103e000000 R14: ffff88103dffffff R15: 0000000000000001
    FS: 00007fb9d1750800(0000) GS:ffff88081f800000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00000000021d2000 CR3: 000000081a08f000 CR4: 00000000001406f0
    Stack:
    ffff880816277cc8 0000000000010000 000000043de07000 0000000000000000
    0000000000001000 ffff880816277e60 0000000000001000 ffff880816277e28
    000000000000c000 0000000000001000 ffff880816277ce8 ffffffff8136c3a6
    Call Trace:
    [] copy_page_to_iter_iovec+0xa6/0x1c0
    [] copy_page_to_iter+0x16/0x90
    [] generic_file_read_iter+0x3e3/0x7c0
    [] ? xfs_file_buffered_aio_write+0xad/0x260 [xfs]
    [] ? down_read+0x12/0x40
    [] xfs_file_buffered_aio_read+0x51/0xc0 [xfs]
    [] xfs_file_read_iter+0x62/0xb0 [xfs]
    [] __vfs_read+0xdf/0x130
    [] vfs_read+0x8e/0x140
    [] SyS_read+0x55/0xc0
    [] do_syscall_64+0x67/0x160
    [] entry_SYSCALL64_slow_path+0x25/0x25
    RIP: 0033:[] 0x7fb9d0c33c00
    RSP: 002b:00007ffc9c262f28 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
    RAX: ffffffffffffffda RBX: fffffffffff8ffff RCX: 00007fb9d0c33c00
    RDX: 0000000000010000 RSI: 00000000021c3000 RDI: 0000000000000004
    RBP: 00000000021c3000 R08: 0000000000000000 R09: 00007ffc9c264d6c
    R10: 00007ffc9c262c50 R11: 0000000000000246 R12: 0000000000010000
    R13: 00007ffc9c2630b0 R14: 0000000000000004 R15: 0000000000010000
    Code: 81 48 0f 44 d0 48 c7 c6 90 4d a3 81 48 c7 c0 bb b3 a2 81 48 0f 44 f0 4d 89 e1 48 89 d9 48 c7 c7 68 16 a3 81 31 c0 e8 f4 57 f7 ff 0b 48 8d 90 00 40 00 00 48 39 d3 0f 83 22 01 00 00 48 39 c3
    RIP [] __check_object_size+0x76/0x413
    RSP

    The checked object's range [ffff88103dfff000, ffff88103e000000) is
    valid, so there shouldn't have been a BUG. The hardened usercopy code
    got confused because the range's ending address is the same as the
    kernel's text starting address at 0xffff88103e000000. The overlap check
    is slightly off.

    Fixes: f5509cc18daa ("mm: Hardened usercopy")
    Signed-off-by: Josh Poimboeuf
    Signed-off-by: Kees Cook

    Josh Poimboeuf
     
  • check_bogus_address() checked for pointer overflow using this expression,
    where 'ptr' has type 'const void *':

    ptr + n < ptr

    Since pointer wraparound is undefined behavior, gcc at -O2 by default
    treats it like the following, which would not behave as intended:

    (long)n < 0

    Fortunately, this doesn't currently happen for kernel code because kernel
    code is compiled with -fno-strict-overflow. But the expression should be
    fixed anyway to use well-defined integer arithmetic, since it could be
    treated differently by different compilers in the future or could be
    reported by tools checking for undefined behavior.

    Signed-off-by: Eric Biggers
    Signed-off-by: Kees Cook

    Eric Biggers
     

12 Aug, 2016

7 commits

  • The following oops occurs after a pgdat is hotadded:

    Unable to handle kernel paging request for data at address 0x00c30001
    Faulting instruction address: 0xc00000000022f8f4
    Oops: Kernel access of bad area, sig: 11 [#1]
    SMP NR_CPUS=2048 NUMA pSeries
    Modules linked in: ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw iptable_filter nls_utf8 isofs sg virtio_balloon uio_pdrv_genirq uio ip_tables xfs libcrc32c sr_mod cdrom sd_mod virtio_net ibmvscsi scsi_transport_srp virtio_pci virtio_ring virtio dm_mirror dm_region_hash dm_log dm_mod
    CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W 4.8.0-rc1-device #110
    task: c000000000ef3080 task.stack: c000000000f6c000
    NIP: c00000000022f8f4 LR: c00000000022f948 CTR: 0000000000000000
    REGS: c000000000f6fa50 TRAP: 0300 Tainted: G W (4.8.0-rc1-device)
    MSR: 800000010280b033 CR: 84002028 XER: 20000000
    CFAR: d000000001d2013c DAR: 0000000000c30001 DSISR: 40000000 SOFTE: 0
    NIP refresh_cpu_vm_stats+0x1a4/0x2f0
    LR refresh_cpu_vm_stats+0x1f8/0x2f0
    Call Trace:
    refresh_cpu_vm_stats+0x1f8/0x2f0 (unreliable)

    Add per_cpu_nodestats initialization to the hotplug codepath.

    Link: http://lkml.kernel.org/r/1470931473-7090-1-git-send-email-arbab@linux.vnet.ibm.com
    Signed-off-by: Reza Arbab
    Cc: Mel Gorman
    Cc: Paul Mackerras
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Reza Arbab
     
  • mm/oom_kill.c: In function `task_will_free_mem':
    mm/oom_kill.c:767: warning: `ret' may be used uninitialized in this function

    If __task_will_free_mem() is never called inside the for_each_process()
    loop, ret will not be initialized.

    Fixes: 1af8bb43269563e4 ("mm, oom: fortify task_will_free_mem()")
    Link: http://lkml.kernel.org/r/1470255599-24841-1-git-send-email-geert@linux-m68k.org
    Signed-off-by: Geert Uytterhoeven
    Acked-by: Tetsuo Handa
    Acked-by: Michal Hocko
    Cc: Oleg Nesterov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Geert Uytterhoeven
     
  • It's quite unlikely that the user will have so little memory that the
    per-CPU quarantines won't fit into the given fraction of the available
    memory. Even in that case, they won't be able to do anything with the
    information given in the warning.

    Link: http://lkml.kernel.org/r/1470929182-101413-1-git-send-email-glider@google.com
    Signed-off-by: Alexander Potapenko
    Acked-by: Andrey Ryabinin
    Cc: Dmitry Vyukov
    Cc: Andrey Konovalov
    Cc: Christoph Lameter
    Cc: Joonsoo Kim
    Cc: Kuthonuzo Luruo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexander Potapenko
     
  • Since commit 73f576c04b94 ("mm: memcontrol: fix cgroup creation failure
    after many small jobs") swap entries do not pin memcg->css.refcnt
    directly. Instead, they pin memcg->id.ref. So we should adjust the
    reference counters accordingly when moving swap charges between cgroups.

    Fixes: 73f576c04b941 ("mm: memcontrol: fix cgroup creation failure after many small jobs")
    Link: http://lkml.kernel.org/r/9ce297c64954a42dc90b543bc76106c4a94f07e8.1470219853.git.vdavydov@virtuozzo.com
    Signed-off-by: Vladimir Davydov
    Acked-by: Michal Hocko
    Acked-by: Johannes Weiner
    Cc: [3.19+]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vladimir Davydov
     
  • An offline memory cgroup might have anonymous memory or shmem left
    charged to it and no swap. Since only swap entries pin the id of an
    offline cgroup, such a cgroup will have no id and so an attempt to
    swapout its anon/shmem will not store memory cgroup info in the swap
    cgroup map. As a result, memcg->swap or memcg->memsw will never get
    uncharged from it or any of its ancestors.

    Fix this by always charging swapout to the first ancestor cgroup that
    hasn't released its id yet.

    [hannes@cmpxchg.org: add comment to mem_cgroup_swapout]
    [vdavydov@virtuozzo.com: use WARN_ON_ONCE() in mem_cgroup_id_get_online()]
    Link: http://lkml.kernel.org/r/20160803123445.GJ13263@esperanza
    Fixes: 73f576c04b941 ("mm: memcontrol: fix cgroup creation failure after many small jobs")
    Link: http://lkml.kernel.org/r/5336daa5c9a32e776067773d9da655d2dc126491.1470219853.git.vdavydov@virtuozzo.com
    Signed-off-by: Vladimir Davydov
    Acked-by: Johannes Weiner
    Acked-by: Michal Hocko
    Cc: [3.19+]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vladimir Davydov
     
  • meminfo_proc_show() and si_mem_available() are using the wrong helpers
    for calculating the size of the LRUs. The user-visible impact is that
    there appears to be an abnormally high number of unevictable pages.

    Link: http://lkml.kernel.org/r/20160805105805.GR2799@techsingularity.net
    Signed-off-by: Mel Gorman
    Cc: Dave Chinner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman
     
  • When a movable node is taken offline by memory hotplug, its free
    hugepages are freed, and /proc/sys/vm/nr_hugepages becomes
    incorrect.

    Fix it by reducing max_huge_pages when the node is offlined.

    n-horiguchi@ah.jp.nec.com said:

    : dissolve_free_huge_page intends to break a hugepage into buddy, and the
    : destination hugepage is supposed to be allocated from the pool of the
    : destination node, so the system-wide pool size is reduced. So adding
    : h->max_huge_pages-- makes sense to me.

    Link: http://lkml.kernel.org/r/1470624546-902-1-git-send-email-zhongjiang@huawei.com
    Signed-off-by: zhong jiang
    Cc: Mike Kravetz
    Acked-by: Naoya Horiguchi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    zhong jiang
     

11 Aug, 2016

6 commits

  • With debugobjects enabled and using SLAB_DESTROY_BY_RCU, when a
    kmem_cache_node is destroyed the call_rcu() may trigger a slab
    allocation to fill the debug object pool (__debug_object_init:fill_pool).

    Everywhere but during kmem_cache_destroy(), discard_slab() is performed
    outside of the kmem_cache_node->list_lock and avoids a lockdep warning
    about potential recursion:

    =============================================
    [ INFO: possible recursive locking detected ]
    4.8.0-rc1-gfxbench+ #1 Tainted: G U
    ---------------------------------------------
    rmmod/8895 is trying to acquire lock:
    (&(&n->list_lock)->rlock){-.-...}, at: [] get_partial_node.isra.63+0x47/0x430

    but task is already holding lock:
    (&(&n->list_lock)->rlock){-.-...}, at: [] __kmem_cache_shutdown+0x54/0x320

    other info that might help us debug this:
    Possible unsafe locking scenario:
    CPU0
    ----
    lock(&(&n->list_lock)->rlock);
    lock(&(&n->list_lock)->rlock);

    *** DEADLOCK ***
    May be due to missing lock nesting notation
    5 locks held by rmmod/8895:
    #0: (&dev->mutex){......}, at: driver_detach+0x42/0xc0
    #1: (&dev->mutex){......}, at: driver_detach+0x50/0xc0
    #2: (cpu_hotplug.dep_map){++++++}, at: get_online_cpus+0x2d/0x80
    #3: (slab_mutex){+.+.+.}, at: kmem_cache_destroy+0x3c/0x220
    #4: (&(&n->list_lock)->rlock){-.-...}, at: __kmem_cache_shutdown+0x54/0x320

    stack backtrace:
    CPU: 6 PID: 8895 Comm: rmmod Tainted: G U 4.8.0-rc1-gfxbench+ #1
    Hardware name: Gigabyte Technology Co., Ltd. H87M-D3H/H87M-D3H, BIOS F11 08/18/2015
    Call Trace:
    __lock_acquire+0x1646/0x1ad0
    lock_acquire+0xb2/0x200
    _raw_spin_lock+0x36/0x50
    get_partial_node.isra.63+0x47/0x430
    ___slab_alloc.constprop.67+0x1a7/0x3b0
    __slab_alloc.isra.64.constprop.66+0x43/0x80
    kmem_cache_alloc+0x236/0x2d0
    __debug_object_init+0x2de/0x400
    debug_object_activate+0x109/0x1e0
    __call_rcu.constprop.63+0x32/0x2f0
    call_rcu+0x12/0x20
    discard_slab+0x3d/0x40
    __kmem_cache_shutdown+0xdb/0x320
    shutdown_cache+0x19/0x60
    kmem_cache_destroy+0x1ae/0x220
    i915_gem_load_cleanup+0x14/0x40 [i915]
    i915_driver_unload+0x151/0x180 [i915]
    i915_pci_remove+0x14/0x20 [i915]
    pci_device_remove+0x34/0xb0
    __device_release_driver+0x95/0x140
    driver_detach+0xb6/0xc0
    bus_remove_driver+0x53/0xd0
    driver_unregister+0x27/0x50
    pci_unregister_driver+0x25/0x70
    i915_exit+0x1a/0x1e2 [i915]
    SyS_delete_module+0x193/0x1f0
    entry_SYSCALL_64_fastpath+0x1c/0xac

    Fixes: 52b4b950b507 ("mm: slab: free kmem_cache_node after destroy sysfs file")
    Link: http://lkml.kernel.org/r/1470759070-18743-1-git-send-email-chris@chris-wilson.co.uk
    Reported-by: Dave Gordon
    Signed-off-by: Chris Wilson
    Reviewed-by: Vladimir Davydov
    Acked-by: Christoph Lameter
    Cc: Pekka Enberg
    Cc: David Rientjes
    Cc: Joonsoo Kim
    Cc: Dmitry Safonov
    Cc: Daniel Vetter
    Cc: Dave Gordon
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Chris Wilson
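    The remedy the trace suggests, moving discard_slab() outside the list_lock, can be sketched in userspace C; all names here are illustrative stand-ins, not the kernel's symbols, and the lock is modelled with a boolean flag:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct slab { struct slab *next; };

static bool list_locked;          /* stand-in for n->list_lock */
static struct slab *partial_list;
static int discarded;

/* Stand-in for discard_slab(): must never run under the lock, since
 * freeing can re-enter the allocator (as call_rcu() did via the
 * debugobjects pool in the trace above). */
static void discard_slab_sketch(struct slab *s)
{
    assert(!list_locked);
    free(s);
    discarded++;
}

static void shutdown_partials(void)
{
    struct slab *s, *next, *to_discard = NULL;

    list_locked = true;                     /* lock(n->list_lock) */
    for (s = partial_list; s; s = next) {   /* unlink under the lock */
        next = s->next;
        s->next = to_discard;
        to_discard = s;
    }
    partial_list = NULL;
    list_locked = false;                    /* unlock(n->list_lock) */

    for (s = to_discard; s; s = next) {     /* free outside the lock */
        next = s->next;
        discard_slab_sketch(s);
    }
}
```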
     
  • In page_remove_file_rmap(.) we have the following check:

    VM_BUG_ON_PAGE(compound && !PageTransHuge(page), page);

    This is meant to check for either HugeTLB pages or THP when a compound
    page is passed in.

    Unfortunately, if one disables CONFIG_TRANSPARENT_HUGEPAGE, then
    PageTransHuge(.) will always return false, provoking BUGs when one runs
    the libhugetlbfs test suite.

    This patch replaces PageTransHuge() with PageHead(), which works for
    both HugeTLB and THP.

    Fixes: dd78fedde4b9 ("rmap: support file thp")
    Link: http://lkml.kernel.org/r/1470838217-5889-1-git-send-email-steve.capper@arm.com
    Signed-off-by: Steve Capper
    Acked-by: Kirill A. Shutemov
    Cc: Huang Shijie
    Cc: Will Deacon
    Cc: Catalin Marinas
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Steve Capper
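    Why the old check fires with THP disabled can be shown with a minimal sketch; the struct, flag, and helpers below are simplified stand-ins for the real page-flag machinery:

```c
#include <assert.h>
#include <stdbool.h>

#define PG_head (1u << 0)

struct page { unsigned flags; };

static bool PageHead(const struct page *p) { return p->flags & PG_head; }

#ifdef CONFIG_TRANSPARENT_HUGEPAGE
static bool PageTransHuge(const struct page *p) { return PageHead(p); }
#else
/* With THP disabled, PageTransHuge() is effectively constant false... */
static bool PageTransHuge(const struct page *p) { (void)p; return false; }
#endif

/* ...so a compound HugeTLB head page trips the old VM_BUG_ON condition: */
static bool old_check_fires(const struct page *p, bool compound)
{
    return compound && !PageTransHuge(p);
}

/* PageHead() tests the head flag regardless of config, covering both
 * HugeTLB and THP: */
static bool new_check_fires(const struct page *p, bool compound)
{
    return compound && !PageHead(p);
}
```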
     
  • PageTransCompound() doesn't distinguish THP from any other type of
    compound page. This can lead to a false-positive VM_BUG_ON() in
    page_add_file_rmap() if it is called on a compound page from a driver[1].

    I think we can exclude such cases by checking whether the page belongs
    to a mapping.

    The VM_BUG_ON_PAGE() is downgraded to VM_WARN_ON_ONCE(). This path
    should not cause any harm to non-THP pages, but it is good to know if
    we step on anything else.

    [1] http://lkml.kernel.org/r/c711e067-0bff-a6cb-3c37-04dfe77d2db1@redhat.com

    Link: http://lkml.kernel.org/r/20160810161345.GA67522@black.fi.intel.com
    Signed-off-by: Kirill A. Shutemov
    Reported-by: Laura Abbott
    Tested-by: Laura Abbott
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
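    The gating idea, treating a compound page as THP only when it belongs to a mapping, can be sketched as follows; the structs and helper names are illustrative, not the kernel's:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct address_space { int unused; };

struct page {
    bool compound;
    struct address_space *mapping;   /* NULL for driver-owned pages */
};

/* PageTransCompound() cannot tell THP from other compound pages... */
static bool PageTransCompound_sketch(const struct page *p)
{
    return p->compound;
}

/* ...so gate the THP path on the page actually having a mapping. */
static bool treat_as_thp(const struct page *p)
{
    return PageTransCompound_sketch(p) && p->mapping != NULL;
}
```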
     
  • Some of the node thresholds depend on the number of managed pages in
    the node. When memory goes online or offline, that number changes and
    the thresholds need to be adjusted accordingly.

    Add recalculation to appropriate places and clean-up related functions
    for better maintenance.

    Link: http://lkml.kernel.org/r/1470724248-26780-2-git-send-email-iamjoonsoo.kim@lge.com
    Signed-off-by: Joonsoo Kim
    Acked-by: Mel Gorman
    Cc: Vlastimil Babka
    Cc: Johannes Weiner
    Cc: Minchan Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joonsoo Kim
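    The shape of the change can be sketched in userspace terms; the struct, the threshold formula, and the hook names below are all hypothetical, the point being only that a threshold derived from managed_pages must be recomputed wherever that count changes:

```c
#include <assert.h>

/* Illustrative node state; the real thresholds live elsewhere. */
struct node_state {
    unsigned long managed_pages;
    unsigned long threshold;
};

/* Hypothetical formula: some fraction of managed pages. */
static void recalc_threshold(struct node_state *n)
{
    n->threshold = n->managed_pages / 100;
}

/* The fix: recalculate at the memory on/offline points. */
static void memory_online(struct node_state *n, unsigned long pages)
{
    n->managed_pages += pages;
    recalc_threshold(n);
}

static void memory_offline(struct node_state *n, unsigned long pages)
{
    n->managed_pages -= pages;
    recalc_threshold(n);
}
```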
     
  • When resetting min_unmapped_pages, the code was initializing
    min_slab_pages instead; it should initialize min_unmapped_pages.

    Fixes: a5f5f91da6 (mm: convert zone_reclaim to node_reclaim)
    Link: http://lkml.kernel.org/r/1470724248-26780-1-git-send-email-iamjoonsoo.kim@lge.com
    Signed-off-by: Joonsoo Kim
    Acked-by: Mel Gorman
    Cc: Vlastimil Babka
    Cc: Johannes Weiner
    Cc: Minchan Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joonsoo Kim
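    The class of bug, a copy-paste initialization writing the wrong field, can be sketched as below; the struct and helper are illustrative, with field names mirroring the kernel's:

```c
#include <assert.h>

/* Illustrative per-node reclaim fields. */
struct node_data {
    unsigned long min_unmapped_pages;
    unsigned long min_slab_pages;
};

/* Before the fix, the reset path wrote min_slab_pages where
 * min_unmapped_pages was intended; the corrected assignment: */
static void init_reclaim(struct node_data *n,
                         unsigned long unmapped, unsigned long slab)
{
    n->min_unmapped_pages = unmapped;   /* was mistakenly the other field */
    n->min_slab_pages = slab;
}
```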
     
  • The newly introduced shmem_huge_enabled() function has two definitions,
    but neither of them is visible if CONFIG_SYSFS is disabled, leading to a
    build error:

    mm/khugepaged.o: In function `khugepaged':
    khugepaged.c:(.text.khugepaged+0x3ca): undefined reference to `shmem_huge_enabled'

    This changes the #ifdef guards around the definition to match those that
    are used in the header file.

    Fixes: e496cf3d7821 ("thp: introduce CONFIG_TRANSPARENT_HUGE_PAGECACHE")
    Link: http://lkml.kernel.org/r/20160809123638.1357593-1-arnd@arndb.de
    Signed-off-by: Arnd Bergmann
    Acked-by: Kirill A. Shutemov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arnd Bergmann
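    The guard-matching pattern can be sketched in one file; the function name and config macro are illustrative stand-ins, with the macro defined locally just so the sketch compiles:

```c
#include <assert.h>
#include <stdbool.h>

#define CONFIG_TRANSPARENT_HUGE_PAGECACHE 1   /* assumed for the sketch */

/* The header declares the symbol under one set of guards... */
#if defined(CONFIG_TRANSPARENT_HUGE_PAGECACHE)
bool shmem_huge_enabled_sketch(void);
#else
static inline bool shmem_huge_enabled_sketch(void) { return false; }
#endif

/* ...so the out-of-line definition must sit under the *same* guards,
 * not an unrelated one like CONFIG_SYSFS; otherwise configs exist
 * where the declaration is visible but no definition is emitted,
 * giving the undefined-reference error quoted above. */
#if defined(CONFIG_TRANSPARENT_HUGE_PAGECACHE)
bool shmem_huge_enabled_sketch(void) { return true; }
#endif
```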
     

10 Aug, 2016

1 commit

  • To distinguish non-slab pages charged to kmemcg we mark them PageKmemcg,
    which sets page->_mapcount to -512. Currently, we set/clear PageKmemcg
    in __alloc_pages_nodemask()/free_pages_prepare() for any page allocated
    with __GFP_ACCOUNT, including those that aren't actually charged to any
    cgroup, i.e. allocated from the root cgroup context. To avoid overhead
    in case cgroups are not used, we only do that if memcg_kmem_enabled() is
    true. The latter is set iff there are kmem-enabled memory cgroups
    (online or offline). The root cgroup is not considered kmem-enabled.

    As a result, if a page is allocated with __GFP_ACCOUNT for the root
    cgroup while there are kmem-enabled memory cgroups, and is freed after
    all kmem-enabled memory cgroups have been removed, e.g.

    # no memory cgroup has been created yet; create one
    mkdir /sys/fs/cgroup/memory/test
    # run something allocating pages with __GFP_ACCOUNT, e.g.
    # a program using pipe
    dmesg | tail
    # remove the memory cgroup
    rmdir /sys/fs/cgroup/memory/test

    we'll get a bad page state bug complaining about page->_mapcount != -1:

    BUG: Bad page state in process swapper/0 pfn:1fd945c
    page:ffffea007f651700 count:0 mapcount:-511 mapping: (null) index:0x0
    flags: 0x1000000000000000()

    To avoid that, let's mark with PageKmemcg only those pages that are
    actually charged to and hence pin a non-root memory cgroup.

    Fixes: 4949148ad433 ("mm: charge/uncharge kmemcg from generic page allocator paths")
    Reported-and-tested-by: Eric Dumazet
    Signed-off-by: Vladimir Davydov
    Signed-off-by: Linus Torvalds

    Vladimir Davydov
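    The failure mode, marking based on mutable global state at allocation but clearing based on it at free, can be sketched in simplified form; all names and the -512 marker layout are illustrative, following the values quoted above:

```c
#include <assert.h>
#include <stdbool.h>

struct page { int mapcount; };   /* -1 means "clean"; -512 marks kmemcg */

static bool kmem_enabled;        /* global state that can change over time */

/* Buggy scheme: consult the global flag at both alloc and free. */
static void buggy_alloc(struct page *p)
{
    p->mapcount = kmem_enabled ? -512 : -1;   /* marks even root-cgroup pages */
}
static void buggy_free(struct page *p)
{
    if (kmem_enabled)            /* flag may have flipped; marker leaks */
        p->mapcount = -1;
}

/* Fixed scheme: mark only pages actually charged to (and pinning) a
 * non-root cgroup, and clear based on the page's own marking. */
static void fixed_alloc(struct page *p, bool charged)
{
    p->mapcount = charged ? -512 : -1;
}
static void fixed_free(struct page *p)
{
    if (p->mapcount == -512)
        p->mapcount = -1;
}
```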
     

09 Aug, 2016

1 commit

  • Pull usercopy protection from Kees Cook:
    "Tbhis implements HARDENED_USERCOPY verification of copy_to_user and
    copy_from_user bounds checking for most architectures on SLAB and
    SLUB"

    * tag 'usercopy-v4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
    mm: SLUB hardened usercopy support
    mm: SLAB hardened usercopy support
    s390/uaccess: Enable hardened usercopy
    sparc/uaccess: Enable hardened usercopy
    powerpc/uaccess: Enable hardened usercopy
    ia64/uaccess: Enable hardened usercopy
    arm64/uaccess: Enable hardened usercopy
    ARM: uaccess: Enable hardened usercopy
    x86/uaccess: Enable hardened usercopy
    mm: Hardened usercopy
    mm: Implement stack frame object validation
    mm: Add is_migrate_cma_page

    Linus Torvalds
     

08 Aug, 2016

2 commits


06 Aug, 2016

1 commit

  • Pull block fixes from Jens Axboe:
    "Here's the second round of block updates for this merge window.

    It's a mix of fixes for changes that went in previously in this round,
    and fixes in general. This pull request contains:

    - Fixes for loop from Christoph

    - A bdi vs gendisk lifetime fix from Dan, worth two cookies.

    - A blk-mq timeout fix, when on frozen queues. From Gabriel.

    - Writeback fix from Jan, ensuring that __writeback_single_inode()
    does the right thing.

    - Fix for bio->bi_rw usage in f2fs from me.

    - Error path deadlock fix in blk-mq sysfs registration from me.

    - Floppy O_ACCMODE fix from Jiri.

    - Fix to the new bio op methods from Mike.

    One more followup will be coming here, ensuring that we don't
    propagate the block types outside of block. That, and a rename of
    bio->bi_rw is coming right after -rc1 is cut.

    - Various little fixes"

    * 'for-linus' of git://git.kernel.dk/linux-block:
    mm/block: convert rw_page users to bio op use
    loop: make do_req_filebacked more robust
    loop: don't try to use AIO for discards
    blk-mq: fix deadlock in blk_mq_register_disk() error path
    Include: blkdev: Removed duplicate 'struct request;' declaration.
    Fixup direct bi_rw modifiers
    block: fix bdi vs gendisk lifetime mismatch
    blk-mq: Allow timeouts to run while queue is freezing
    nbd: fix race in ioctl
    block: fix use-after-free in seq file
    f2fs: drop bio->bi_rw manual assignment
    block: add missing group association in bio-cloning functions
    blkcg: kill unused field nr_undestroyed_grps
    writeback: Write dirty times for WB_SYNC_ALL writeback
    floppy: fix open(O_ACCMODE) for ioctl-only open

    Linus Torvalds
     

05 Aug, 2016

3 commits

  • Pull more powerpc updates from Michael Ellerman:
    "These were delayed for various reasons, so I let them sit in next a
    bit longer, rather than including them in my first pull request.

    Fixes:
    - Fix early access to cpu_spec relocation from Benjamin Herrenschmidt
    - Fix incorrect event codes in power9-event-list from Madhavan Srinivasan
    - Move register_process_table() out of ppc_md from Michael Ellerman

    Use jump_label for [cpu|mmu]_has_feature():
    - Add mmu_early_init_devtree() from Michael Ellerman
    - Move disable_radix handling into mmu_early_init_devtree() from Michael Ellerman
    - Do hash device tree scanning earlier from Michael Ellerman
    - Do radix device tree scanning earlier from Michael Ellerman
    - Do feature patching before MMU init from Michael Ellerman
    - Check features don't change after patching from Michael Ellerman
    - Make MMU_FTR_RADIX a MMU family feature from Aneesh Kumar K.V
    - Convert mmu_has_feature() to returning bool from Michael Ellerman
    - Convert cpu_has_feature() to returning bool from Michael Ellerman
    - Define radix_enabled() in one place & use static inline from Michael Ellerman
    - Add early_[cpu|mmu]_has_feature() from Michael Ellerman
    - Convert early cpu/mmu feature check to use the new helpers from Aneesh Kumar K.V
    - jump_label: Make it possible for arches to invoke jump_label_init() earlier from Kevin Hao
    - Call jump_label_init() in apply_feature_fixups() from Aneesh Kumar K.V
    - Remove mfvtb() from Kevin Hao
    - Move cpu_has_feature() to a separate file from Kevin Hao
    - Add kconfig option to use jump labels for cpu/mmu_has_feature() from Michael Ellerman
    - Add option to use jump label for cpu_has_feature() from Kevin Hao
    - Add option to use jump label for mmu_has_feature() from Kevin Hao
    - Catch usage of cpu/mmu_has_feature() before jump label init from Aneesh Kumar K.V
    - Annotate jump label assembly from Michael Ellerman

    TLB flush enhancements from Aneesh Kumar K.V:
    - radix: Implement tlb mmu gather flush efficiently
    - Add helper for finding SLBE LLP encoding
    - Use hugetlb flush functions
    - Drop multiple definition of mm_is_core_local
    - radix: Add tlb flush of THP ptes
    - radix: Rename function and drop unused arg
    - radix/hugetlb: Add helper for finding page size
    - hugetlb: Add flush_hugetlb_tlb_range
    - remove flush_tlb_page_nohash

    Add new ptrace regsets from Anshuman Khandual and Simon Guo:
    - elf: Add powerpc specific core note sections
    - Add the function flush_tmregs_to_thread
    - Enable in transaction NT_PRFPREG ptrace requests
    - Enable in transaction NT_PPC_VMX ptrace requests
    - Enable in transaction NT_PPC_VSX ptrace requests
    - Adapt gpr32_get, gpr32_set functions for transaction
    - Enable support for NT_PPC_CGPR
    - Enable support for NT_PPC_CFPR
    - Enable support for NT_PPC_CVMX
    - Enable support for NT_PPC_CVSX
    - Enable support for TM SPR state
    - Enable NT_PPC_TM_CTAR, NT_PPC_TM_CPPR, NT_PPC_TM_CDSCR
    - Enable support for NT_PPPC_TAR, NT_PPC_PPR, NT_PPC_DSCR
    - Enable support for EBB registers
    - Enable support for Performance Monitor registers"

    * tag 'powerpc-4.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (48 commits)
    powerpc/mm: Move register_process_table() out of ppc_md
    powerpc/perf: Fix incorrect event codes in power9-event-list
    powerpc/32: Fix early access to cpu_spec relocation
    powerpc/ptrace: Enable support for Performance Monitor registers
    powerpc/ptrace: Enable support for EBB registers
    powerpc/ptrace: Enable support for NT_PPPC_TAR, NT_PPC_PPR, NT_PPC_DSCR
    powerpc/ptrace: Enable NT_PPC_TM_CTAR, NT_PPC_TM_CPPR, NT_PPC_TM_CDSCR
    powerpc/ptrace: Enable support for TM SPR state
    powerpc/ptrace: Enable support for NT_PPC_CVSX
    powerpc/ptrace: Enable support for NT_PPC_CVMX
    powerpc/ptrace: Enable support for NT_PPC_CFPR
    powerpc/ptrace: Enable support for NT_PPC_CGPR
    powerpc/ptrace: Adapt gpr32_get, gpr32_set functions for transaction
    powerpc/ptrace: Enable in transaction NT_PPC_VSX ptrace requests
    powerpc/ptrace: Enable in transaction NT_PPC_VMX ptrace requests
    powerpc/ptrace: Enable in transaction NT_PRFPREG ptrace requests
    powerpc/process: Add the function flush_tmregs_to_thread
    elf: Add powerpc specific core note sections
    powerpc/mm: remove flush_tlb_page_nohash
    powerpc/mm/hugetlb: Add flush_hugetlb_tlb_range
    ...

    Linus Torvalds
     
  • A NULL pointer dereference occurs, and type_a->regions[0] info cannot
    be obtained, if the type_b parameter of __next_mem_range_rev() is NULL.

    Fix this by checking type_b before dereferencing it and by
    initializing idx_b to 0.

    The fix was tested by separately dumping all region types via
    __memblock_dump_all() and via the fixed __next_mem_range_rev() to a
    UART; inspection of the logs shows the results are okay.

    Link: http://lkml.kernel.org/r/57A0320D.6070102@zoho.com
    Signed-off-by: zijun_hu
    Tested-by: zijun_hu
    Acked-by: Tejun Heo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    zijun_hu
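    The guard pattern, check the pointer before dereferencing and start the index from 0, can be sketched as follows; the struct and helper are illustrative, with names echoing the commit:

```c
#include <assert.h>
#include <stddef.h>

struct mb_type { int nr; };

static int regions_in_b(const struct mb_type *type_b)
{
    int idx_b = 0;          /* initialize instead of leaving it stale */
    if (type_b)             /* check before dereferencing */
        idx_b = type_b->nr;
    return idx_b;
}
```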
     
  • With m68k-linux-gnu-gcc-4.1:

    include/linux/slub_def.h:126: warning: `fixup_red_left' declared inline after being called
    include/linux/slub_def.h:126: warning: previous declaration of `fixup_red_left' was here

    Commit c146a2b98eb5 ("mm, kasan: account for object redzone in SLUB's
    nearest_obj()") made fixup_red_left() global, but forgot to remove the
    inline keyword.

    Fixes: c146a2b98eb5898e ("mm, kasan: account for object redzone in SLUB's nearest_obj()")
    Link: http://lkml.kernel.org/r/1470256262-1586-1-git-send-email-geert@linux-m68k.org
    Signed-off-by: Geert Uytterhoeven
    Cc: Alexander Potapenko
    Acked-by: David Rientjes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Geert Uytterhoeven
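    The fix amounts to giving the now-global function a plain external definition, since old compilers such as the gcc 4.1 above warn when a function is declared inline only after a call to it has been seen. A sketch with an illustrative name and a plausible body (shifting a pointer past the left redzone padding):

```c
#include <assert.h>

/* Plain external definition: no `inline`, so no declared-inline-
 * after-being-called warning on old compilers. */
void *fixup_red_left_sketch(unsigned long red_left_pad, void *p)
{
    return (char *)p + red_left_pad;   /* skip the left redzone */
}
```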