19 May, 2017

7 commits


14 May, 2017

5 commits

  • Linus Torvalds
     
  • Pull some more input subsystem updates from Dmitry Torokhov:
    "An updated xpad driver with a few more recognized device IDs, and a
    new psxpad-spi driver, allowing connecting Playstation 1 and 2 joypads
    via SPI bus"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
    Input: cros_ec_keyb - remove extraneous 'const'
    Input: add support for PlayStation 1/2 joypads connected via SPI
    Input: xpad - add USB IDs for Mad Catz Brawlstick and Razer Sabertooth
    Input: xpad - sync supported devices with xboxdrv
    Input: xpad - sort supported devices by USB ID

    Linus Torvalds
     
  • Pull UBI/UBIFS updates from Richard Weinberger:

    - new config option CONFIG_UBIFS_FS_SECURITY

    - minor improvements

    - random fixes

    * tag 'upstream-4.12-rc1' of git://git.infradead.org/linux-ubifs:
    ubi: Add debugfs file for tracking PEB state
    ubifs: Fix a typo in comment of ioctl2ubifs & ubifs2ioctl
    ubifs: Remove unnecessary assignment
    ubifs: Fix cut and paste error on sb type comparisons
    ubi: fastmap: Fix slab corruption
    ubifs: Add CONFIG_UBIFS_FS_SECURITY to disable/enable security labels
    ubi: Make mtd parameter readable
    ubi: Fix section mismatch

    Linus Torvalds
     
  • Pull UML fixes from Richard Weinberger:
    "No new stuff, just fixes"

    * 'for-linus-4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
    um: Add missing NR_CPUS include
    um: Fix to call read_initrd after init_bootmem
    um: Include kbuild.h instead of duplicating its macros
    um: Fix PTRACE_POKEUSER on x86_64
    um: Set number of CPUs
    um: Fix _print_addr()

    Linus Torvalds
     
  • Merge misc fixes from Andrew Morton:
    "15 fixes"

    * emailed patches from Andrew Morton:
    mm, docs: update memory.stat description with workingset* entries
    mm: vmscan: scan until it finds eligible pages
    mm, thp: copying user pages must schedule on collapse
    dax: fix PMD data corruption when fault races with write
    dax: fix data corruption when fault races with write
    ext4: return to starting transaction in ext4_dax_huge_fault()
    mm: fix data corruption due to stale mmap reads
    dax: prevent invalidation of mapped DAX entries
    Tigran has moved
    mm, vmalloc: fix vmalloc users tracking properly
    mm/khugepaged: add missed tracepoint for collapse_huge_page_swapin
    gcov: support GCC 7.1
    mm, vmstat: Remove spurious WARN() during zoneinfo print
    time: delete current_fs_time()
    hwpoison, memcg: forcibly uncharge LRU pages

    Linus Torvalds
     

13 May, 2017

28 commits

  • Commit 4b4cea91691d ("mm: vmscan: fix IO/refault regression in cache
    workingset transition") introduced three new entries in memory stat
    file:

    - workingset_refault
    - workingset_activate
    - workingset_nodereclaim

    This commit adds a corresponding description to the cgroup v2 docs.
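As a rough illustration of how the three new entries can be consumed, the sketch below looks up one counter in a buffer that has the flat "name value" layout used by memory.stat. The function name memstat_lookup and the sample text in the usage note are hypothetical and are not part of the commit; reading the file from a particular cgroup path is left to the caller.

```c
#include <stdio.h>
#include <string.h>

/*
 * Look up one counter in a flat-keyed "name value" buffer, such as the
 * contents of a cgroup v2 memory.stat file.
 * Returns 0 and stores the value in *out on success, -1 if the key is
 * absent or its value cannot be parsed.
 */
int memstat_lookup(const char *buf, const char *key, unsigned long long *out)
{
    size_t klen = strlen(key);
    const char *line = buf;

    while (line && *line) {
        /* A match needs the complete key followed by a single space. */
        if (strncmp(line, key, klen) == 0 && line[klen] == ' ') {
            if (sscanf(line + klen + 1, "%llu", out) == 1)
                return 0;
            return -1;
        }
        line = strchr(line, '\n');
        if (line)
            line++;     /* step past the newline to the next entry */
    }
    return -1;
}
```

Called with a buffer such as "workingset_refault 12\nworkingset_activate 3\n" and the key "workingset_activate", the function stores 3 in *out. Requiring a space right after the key keeps a prefix such as "workingset" from matching a longer entry name.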

    Link: http://lkml.kernel.org/r/1494530293-31236-1-git-send-email-guro@fb.com
    Signed-off-by: Roman Gushchin
    Cc: Johannes Weiner
    Cc: Michal Hocko
    Cc: Vladimir Davydov
    Cc: Tejun Heo
    Cc: Li Zefan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Roman Gushchin
     
  • Although there are a ton of free swap and anonymous LRU pages in eligible
    zones, OOM happened.

    balloon invoked oom-killer: gfp_mask=0x17080c0(GFP_KERNEL_ACCOUNT|__GFP_ZERO|__GFP_NOTRACK), nodemask=(null), order=0, oom_score_adj=0
    CPU: 7 PID: 1138 Comm: balloon Not tainted 4.11.0-rc6-mm1-zram-00289-ge228d67e9677-dirty #17
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
    Call Trace:
    oom_kill_process+0x21d/0x3f0
    out_of_memory+0xd8/0x390
    __alloc_pages_slowpath+0xbc1/0xc50
    __alloc_pages_nodemask+0x1a5/0x1c0
    pte_alloc_one+0x20/0x50
    __pte_alloc+0x1e/0x110
    __handle_mm_fault+0x919/0x960
    handle_mm_fault+0x77/0x120
    __do_page_fault+0x27a/0x550
    trace_do_page_fault+0x43/0x150
    do_async_page_fault+0x2c/0x90
    async_page_fault+0x28/0x30
    Mem-Info:
    active_anon:424716 inactive_anon:65314 isolated_anon:0
    active_file:52 inactive_file:46 isolated_file:0
    unevictable:0 dirty:27 writeback:0 unstable:0
    slab_reclaimable:3967 slab_unreclaimable:4125
    mapped:133 shmem:43 pagetables:1674 bounce:0
    free:4637 free_pcp:225 free_cma:0
    Node 0 active_anon:1698864kB inactive_anon:261256kB active_file:208kB inactive_file:184kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:532kB dirty:108kB writeback:0kB shmem:172kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
    DMA free:7316kB min:32kB low:44kB high:56kB active_anon:8064kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15992kB managed:15908kB mlocked:0kB slab_reclaimable:464kB slab_unreclaimable:40kB kernel_stack:0kB pagetables:24kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
    lowmem_reserve[]: 0 992 992 1952
    DMA32 free:9088kB min:2048kB low:3064kB high:4080kB active_anon:952176kB inactive_anon:0kB active_file:36kB inactive_file:0kB unevictable:0kB writepending:88kB present:1032192kB managed:1019388kB mlocked:0kB slab_reclaimable:13532kB slab_unreclaimable:16460kB kernel_stack:3552kB pagetables:6672kB bounce:0kB free_pcp:56kB local_pcp:24kB free_cma:0kB
    lowmem_reserve[]: 0 0 0 959
    Movable free:3644kB min:1980kB low:2960kB high:3940kB active_anon:738560kB inactive_anon:261340kB active_file:188kB inactive_file:640kB unevictable:0kB writepending:20kB present:1048444kB managed:1010816kB mlocked:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:832kB local_pcp:60kB free_cma:0kB
    lowmem_reserve[]: 0 0 0 0
    DMA: 1*4kB (E) 0*8kB 18*16kB (E) 10*32kB (E) 10*64kB (E) 9*128kB (ME) 8*256kB (E) 2*512kB (E) 2*1024kB (E) 0*2048kB 0*4096kB = 7524kB
    DMA32: 417*4kB (UMEH) 181*8kB (UMEH) 68*16kB (UMEH) 48*32kB (UMEH) 14*64kB (MH) 3*128kB (M) 1*256kB (H) 1*512kB (M) 2*1024kB (M) 0*2048kB 0*4096kB = 9836kB
    Movable: 1*4kB (M) 1*8kB (M) 1*16kB (M) 1*32kB (M) 0*64kB 1*128kB (M) 2*256kB (M) 4*512kB (M) 1*1024kB (M) 0*2048kB 0*4096kB = 3772kB
    378 total pagecache pages
    17 pages in swap cache
    Swap cache stats: add 17325, delete 17302, find 0/27
    Free swap = 978940kB
    Total swap = 1048572kB
    524157 pages RAM
    0 pages HighMem/MovableOnly
    12629 pages reserved
    0 pages cma reserved
    0 pages hwpoisoned
    [ pid ] uid tgid total_vm rss nr_ptes nr_pmds swapents oom_score_adj name
    [ 433] 0 433 4904 5 14 3 82 0 upstart-udev-br
    [ 438] 0 438 12371 5 27 3 191 -1000 systemd-udevd

    Investigation showed that page skipping in isolate_lru_pages makes
    reclaim ineffective: it easily returns a zero nr_taken, so LRU shrinking
    achieves nothing and only raises the reclaim priority aggressively.
    Finally, OOM happens.

    The problem is that get_scan_count determines nr_to_scan from eligible
    zones only, so even when priority drops to zero, no pages can be
    reclaimed if the LRU contains mostly ineligible pages.

    get_scan_count:

    size = lruvec_lru_size(lruvec, lru, sc->reclaim_idx);
    size = size >> sc->priority;

    Assume sc->priority is 0 and the LRU list is as follows.

    N-N-N-N-H-H-H-H-H-H-H-H-H-H-H-H-H-H-H-H

    (i.e., a few eligible pages are at the head of the LRU, but almost all
    of the others are ineligible pages)

    In that case, size becomes 4, so the VM wants to scan 4 pages, but the
    4 pages at the tail of the LRU are not eligible. If skipped pages are
    counted as scanned, nothing is reclaimed once those 4 pages have been
    scanned, and the result is an OOM.

    This patch makes isolate_lru_pages keep scanning until it encounters
    pages from eligible zones.
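The effect described above can be reproduced with a small user-space model. This is only a sketch under stated assumptions, not the kernel code: the names page_kind and isolate_sketch are hypothetical, the LRU is modelled as an array with its tail at the highest index, and the count_skipped flag switches between charging skipped pages to the scan budget and not charging them.

```c
#include <stddef.h>

/* Hypothetical page classes for the illustration: N = eligible, H = not. */
enum page_kind { PAGE_ELIGIBLE, PAGE_INELIGIBLE };

/*
 * Walk the list from the tail (highest index) with a budget of nr_to_scan.
 * If count_skipped is nonzero, ineligible pages use up the budget, which
 * models the problem case. Otherwise only eligible pages are charged, so
 * the walk continues past skipped pages.
 * Returns the number of eligible pages taken.
 */
size_t isolate_sketch(const enum page_kind *lru, size_t len,
                      size_t nr_to_scan, int count_skipped)
{
    size_t scanned = 0, taken = 0;

    for (size_t i = len; i > 0 && scanned < nr_to_scan; i--) {
        if (lru[i - 1] == PAGE_INELIGIBLE) {
            if (count_skipped)
                scanned++;      /* skipped page still consumes the budget */
            continue;
        }
        scanned++;
        taken++;
    }
    return taken;
}
```

For the 20-entry list shown above (4 eligible pages at the head, 16 ineligible pages toward the tail) and a budget of 4, the model takes no pages when skipped pages are charged and takes all 4 eligible pages when they are not.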

    [akpm@linux-foundation.org: clean up mind-bending `for' statement. Tweak comment text]
    Fixes: 3db65812d688 ("Revert "mm, vmscan: account for skipped pages as a partial scan"")
    Link: http://lkml.kernel.org/r/1494457232-27401-1-git-send-email-minchan@kernel.org
    Signed-off-by: Minchan Kim
    Acked-by: Michal Hocko
    Acked-by: Johannes Weiner
    Cc: Mel Gorman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Minchan Kim
     
  • We have encountered need_resched warnings in __collapse_huge_page_copy()
    while doing {clear,copy}_user_highpage() over HPAGE_PMD_NR source pages.

    mm->mmap_sem is held for write, but the iteration is well bounded.

    Reschedule as needed.

    Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1705101426380.109808@chino.kir.corp.google.com
    Signed-off-by: David Rientjes
    Acked-by: Vlastimil Babka
    Cc: "Kirill A. Shutemov"
    Cc: Johannes Weiner
    Cc: Mel Gorman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Rientjes
     
  • This is based on a patch from Jan Kara that fixed the equivalent race in
    the DAX PTE fault path.

    Currently DAX PMD read fault can race with write(2) in the following
    way:

    CPU1 - write(2) CPU2 - read fault
    dax_iomap_pmd_fault()
    ->iomap_begin() - sees hole

    dax_iomap_rw()
    iomap_apply()
    ->iomap_begin - allocates blocks
    dax_iomap_actor()
    invalidate_inode_pages2_range()
    - there's nothing to invalidate

    grab_mapping_entry()
    - we add huge zero page to the radix tree
    and map it to page tables

    The result is that hole page is mapped into page tables (and thus zeros
    are seen in mmap) while file has data written in that place.

    Fix the problem by locking exception entry before mapping blocks for the
    fault. That way we are sure invalidate_inode_pages2_range() call for
    racing write will either block on entry lock waiting for the fault to
    finish (and unmap stale page tables after that) or read fault will see
    already allocated blocks by write(2).

    Fixes: 9f141d6ef6258 ("dax: Call ->iomap_begin without entry lock during dax fault")
    Link: http://lkml.kernel.org/r/20170510172700.18991-1-ross.zwisler@linux.intel.com
    Signed-off-by: Ross Zwisler
    Reviewed-by: Jan Kara
    Cc: Dan Williams
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ross Zwisler
     
  • Currently DAX read fault can race with write(2) in the following way:

    CPU1 - write(2) CPU2 - read fault
    dax_iomap_pte_fault()
    ->iomap_begin() - sees hole
    dax_iomap_rw()
    iomap_apply()
    ->iomap_begin - allocates blocks
    dax_iomap_actor()
    invalidate_inode_pages2_range()
    - there's nothing to invalidate
    grab_mapping_entry()
    - we add zero page in the radix tree
    and map it to page tables

    The result is that hole page is mapped into page tables (and thus zeros
    are seen in mmap) while file has data written in that place.

    Fix the problem by locking exception entry before mapping blocks for the
    fault. That way we are sure invalidate_inode_pages2_range() call for
    racing write will either block on entry lock waiting for the fault to
    finish (and unmap stale page tables after that) or read fault will see
    already allocated blocks by write(2).

    Fixes: 9f141d6ef6258a3a37a045842d9ba7e68f368956
    Link: http://lkml.kernel.org/r/20170510085419.27601-5-jack@suse.cz
    Signed-off-by: Jan Kara
    Reviewed-by: Ross Zwisler
    Cc: Dan Williams
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara
     
  • DAX will return to locking exceptional entry before mapping blocks for a
    page fault to fix possible races with concurrent writes. To avoid lock
    inversion between exceptional entry lock and transaction start, start
    the transaction already in ext4_dax_huge_fault().

    Fixes: 9f141d6ef6258a3a37a045842d9ba7e68f368956
    Link: http://lkml.kernel.org/r/20170510085419.27601-4-jack@suse.cz
    Signed-off-by: Jan Kara
    Cc: Ross Zwisler
    Cc: Dan Williams
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara
     
  • Currently, we don't invalidate page tables during invalidate_inode_pages2()
    for DAX. That can result in e.g. a 2MiB zero page being mapped into
    page tables while underlying blocks are already allocated, so that
    data seen through mmap differs from data seen by read(2).
    The following sequence reproduces the problem:

    - open an mmap over a 2MiB hole

    - read from a 2MiB hole, faulting in a 2MiB zero page

    - write to the hole with write(2). The write succeeds but we
    incorrectly leave the 2MiB zero page mapping intact.

    - via the mmap, read the data that was just written. Since the zero
    page mapping is still intact we read back zeroes instead of the new
    data.

    Fix the problem by unconditionally calling invalidate_inode_pages2_range()
    in dax_iomap_actor() for new block allocations and by properly
    invalidating page tables in invalidate_inode_pages2_range() for DAX
    mappings.
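The reproduction steps listed above can be sketched in user space roughly as follows. This is a hypothetical illustration, not the t_mmap_stale program mentioned elsewhere in this series: the function name mmap_sees_write and the scratch path are assumptions, and whether the kernel really backs the read fault with a 2MiB zero page depends on the filesystem, mount options and mapping alignment, none of which the sketch controls.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

#define HOLE_SIZE (2UL * 1024 * 1024)   /* 2MiB, matching the example */

/*
 * Returns 1 if data written with write(2) is visible through an existing
 * shared mapping, 0 if the mapping still shows stale contents, and -1 on
 * setup failure. "path" is a scratch file chosen by the caller; it is
 * created, truncated and removed again.
 */
int mmap_sees_write(const char *path)
{
    static const char msg[] = "new data";
    int result = -1;
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);

    if (fd < 0)
        return -1;
    if (ftruncate(fd, HOLE_SIZE) == 0) {        /* the file is one big hole */
        char *map = mmap(NULL, HOLE_SIZE, PROT_READ, MAP_SHARED, fd, 0);

        if (map != MAP_FAILED) {
            volatile char first = map[0];       /* read fault on the hole */
            (void)first;
            /* Write to the hole, then read it back via the mapping. */
            if (write(fd, msg, sizeof(msg)) == (ssize_t)sizeof(msg))
                result = memcmp(map, msg, sizeof(msg)) == 0;
            munmap(map, HOLE_SIZE);
        }
    }
    close(fd);
    unlink(path);
    return result;
}
```

On a filesystem without the bug the function is expected to return 1; a return of 0 would correspond to the stale zero page mapping described in the commit message.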

    Fixes: c6dcf52c23d2d3fb5235cec42d7dd3f786b87d55
    Link: http://lkml.kernel.org/r/20170510085419.27601-3-jack@suse.cz
    Signed-off-by: Jan Kara
    Signed-off-by: Ross Zwisler
    Cc: Dan Williams
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara
     
  • Patch series "mm,dax: Fix data corruption due to mmap inconsistency",
    v4.

    This series fixes data corruption that can happen for DAX mounts when
    page faults race with write(2) and as a result page tables get out of
    sync with block mappings in the filesystem and thus data seen through
    mmap is different from data seen through read(2).

    The series passes testing with t_mmap_stale test program from Ross and
    also other mmap related tests on DAX filesystem.

    This patch (of 4):

    dax_invalidate_mapping_entry() currently removes DAX exceptional entries
    only if they are clean and unlocked. This is done via:

    invalidate_mapping_pages()
    invalidate_exceptional_entry()
    dax_invalidate_mapping_entry()

    However, for page cache pages removed in invalidate_mapping_pages()
    there is an additional criterion, which is that the page must not be
    mapped. This is noted in the comments above invalidate_mapping_pages()
    and is checked in invalidate_inode_page().

    For DAX entries this means that we can end up in a situation where a
    DAX exceptional entry, either a huge zero page or a regular DAX entry,
    is mapped but has no associated radix tree entry. This is
    inconsistent with the rest of the DAX code and with what happens in the
    page cache case.

    We aren't able to unmap the DAX exceptional entry because according to
    its comments invalidate_mapping_pages() isn't allowed to block, and
    unmap_mapping_range() takes a write lock on the mapping->i_mmap_rwsem.

    Since we essentially never have unmapped DAX entries to evict from the
    radix tree, just remove dax_invalidate_mapping_entry().

    Fixes: c6dcf52c23d2 ("mm: Invalidate DAX radix tree entries only if appropriate")
    Link: http://lkml.kernel.org/r/20170510085419.27601-2-jack@suse.cz
    Signed-off-by: Ross Zwisler
    Signed-off-by: Jan Kara
    Reported-by: Jan Kara
    Cc: Dan Williams
    Cc: [4.10+]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ross Zwisler
     
  • Cc: Tigran Aivazian
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     
  • Commit 1f5307b1e094 ("mm, vmalloc: properly track vmalloc users") has
    pulled asm/pgtable.h include dependency to linux/vmalloc.h and that
    turned out to be a bad idea for some architectures. E.g. m68k fails
    with

    In file included from arch/m68k/include/asm/pgtable_mm.h:145:0,
    from arch/m68k/include/asm/pgtable.h:4,
    from include/linux/vmalloc.h:9,
    from arch/m68k/kernel/module.c:9:
    arch/m68k/include/asm/mcf_pgtable.h: In function 'nocache_page':
    >> arch/m68k/include/asm/mcf_pgtable.h:339:43: error: 'init_mm' undeclared (first use in this function)
    #define pgd_offset_k(address) pgd_offset(&init_mm, address)

    as spotted by the kernel build bot. nios2 fails for a different reason:

    In file included from include/asm-generic/io.h:767:0,
    from arch/nios2/include/asm/io.h:61,
    from include/linux/io.h:25,
    from arch/nios2/include/asm/pgtable.h:18,
    from include/linux/mm.h:70,
    from include/linux/pid_namespace.h:6,
    from include/linux/ptrace.h:9,
    from arch/nios2/include/uapi/asm/elf.h:23,
    from arch/nios2/include/asm/elf.h:22,
    from include/linux/elf.h:4,
    from include/linux/module.h:15,
    from init/main.c:16:
    include/linux/vmalloc.h: In function '__vmalloc_node_flags':
    include/linux/vmalloc.h:99:40: error: 'PAGE_KERNEL' undeclared (first use in this function); did you mean 'GFP_KERNEL'?

    which is due to the newly added #include <asm/pgtable.h>, which on nios2
    includes <linux/io.h> and thus <asm/io.h> and <asm-generic/io.h>, which
    again includes <linux/vmalloc.h>.

    Tweaking the includes turns out to be a bigger headache than necessary.
    This patch reverts 1f5307b1e094 and reimplements the original fix in a
    different way. __vmalloc_node_flags can stay static inline which will
    cover vmalloc* functions. We only have one external user
    (kvmalloc_node) and we can export __vmalloc_node_flags_caller and
    provide the caller directly. This is much simpler and it doesn't really
    need any games with header files.

    [akpm@linux-foundation.org: coding-style fixes]
    [mhocko@kernel.org: revert old comment]
    Link: http://lkml.kernel.org/r/20170509211054.GB16325@dhcp22.suse.cz
    Fixes: 1f5307b1e094 ("mm, vmalloc: properly track vmalloc users")
    Link: http://lkml.kernel.org/r/20170509153702.GR6481@dhcp22.suse.cz
    Signed-off-by: Michal Hocko
    Cc: Tobias Klauser
    Cc: Geert Uytterhoeven
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michal Hocko
     
  • One return case of `__collapse_huge_page_swapin()` does not invoke the
    tracepoint, while every other return case does. This commit adds the
    missing tracepoint invocation for that case.

    Link: http://lkml.kernel.org/r/20170507101813.30187-1-sj38.park@gmail.com
    Signed-off-by: SeongJae Park
    Cc: Kirill A. Shutemov
    Cc: Johannes Weiner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    SeongJae Park
     
  • Starting from GCC 7.1, __gcov_exit is a new symbol expected to be
    implemented in a profiling runtime.

    [akpm@linux-foundation.org: coding-style fixes]
    [mliska@suse.cz: v2]
    Link: http://lkml.kernel.org/r/e63a3c59-0149-c97e-4084-20ca8f146b26@suse.cz
    Link: http://lkml.kernel.org/r/8c4084fa-3885-29fe-5fc4-0d4ca199c785@suse.cz
    Signed-off-by: Martin Liska
    Acked-by: Peter Oberparleiter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Martin Liska
     
  • After commit e2ecc8a79ed4 ("mm, vmstat: print non-populated zones in
    zoneinfo"), /proc/zoneinfo will show unpopulated zones.

    A memoryless node, having no populated zones at all, was previously
    ignored, but will now trigger the WARN() in is_zone_first_populated().

    Remove this warning, as its only purpose was to warn of a situation that
    has since been enabled.

    Aside: The "per-node stats" are still printed under the first populated
    zone, but that's not necessarily the first stanza any more. I'm not
    sure which criterion is more important with regard to not breaking
    parsers, but it looks a little weird to the eye.

    Fixes: e2ecc8a79ed4 ("mm, vmstat: print node-based stats in zoneinfo file")
    Link: http://lkml.kernel.org/r/1493854905-10918-1-git-send-email-arbab@linux.vnet.ibm.com
    Signed-off-by: Reza Arbab
    Cc: David Rientjes
    Cc: Anshuman Khandual
    Cc: Vlastimil Babka
    Cc: Mel Gorman
    Cc: Johannes Weiner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Reza Arbab
     
  • All uses of the current_fs_time() function have been replaced by other
    time interfaces.

    And, its use cases can be fulfilled by current_time() or ktime_get_*
    variants.

    Link: http://lkml.kernel.org/r/1491613030-11599-13-git-send-email-deepa.kernel@gmail.com
    Signed-off-by: Deepa Dinamani
    Reviewed-by: Arnd Bergmann
    Cc: John Stultz
    Cc: Thomas Gleixner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Deepa Dinamani
     
  • Laurent Dufour has noticed that hwpoisoned pages are kept charged. In
    his particular case he has hit a bad_page("page still charged to
    cgroup") when onlining a hwpoison page. While this looks like something
    that shouldn't happen in the first place, because onlining hwpoison
    pages and returning them to the page allocator makes little sense, it
    shows a real problem.

    hwpoison pages do not get freed usually so we do not uncharge them (at
    least not since commit 0a31bc97c80c ("mm: memcontrol: rewrite uncharge
    API")). Each charge pins memcg (since e8ea14cc6ead ("mm: memcontrol:
    take a css reference for each charged page")) as well and so the
    mem_cgroup and the associated state will never go away. Fix this leak
    by forcibly uncharging a LRU hwpoisoned page in delete_from_lru_cache().
    We also have to tweak uncharge_list because it cannot rely on zero ref
    count for these pages.

    [akpm@linux-foundation.org: coding-style fixes]
    Fixes: 0a31bc97c80c ("mm: memcontrol: rewrite uncharge API")
    Link: http://lkml.kernel.org/r/20170502185507.GB19165@dhcp22.suse.cz
    Signed-off-by: Michal Hocko
    Reported-by: Laurent Dufour
    Tested-by: Laurent Dufour
    Reviewed-by: Balbir Singh
    Reviewed-by: Naoya Horiguchi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michal Hocko
     
  • Pull libnvdimm fixes from Dan Williams:
    "Incremental fixes and a small feature addition on top of the main
    libnvdimm 4.12 pull request:

    - Geert noticed that tinyconfig was bloated by BLOCK selecting DAX.
    The size regression is fixed by moving all dax helpers into the
    dax-core and only specifying "select DAX" for FS_DAX and
    dax-capable drivers. He also asked for clarification of the
    NR_DEV_DAX config option which, on closer look, does not need to be
    a config option at all. Mike also throws in a DEV_DAX_PMEM fixup
    for good measure.

    - Ben's attention to detail on -stable patch submissions caught a
    case where the recent fixes to arch_copy_from_iter_pmem() missed a
    condition where we strand dirty data in the cache. This is tagged
    for -stable and will also be included in the rework of the pmem api
    to a proposed {memcpy,copy_user}_flushcache() interface for 4.13.

    - Vishal adds a feature that missed the initial pull due to pending
    review feedback. It allows the kernel to clear media errors when
    initializing a BTT (atomic sector update driver) instance on a pmem
    namespace.

    - Ross noticed that the dax_device + dax_operations conversion broke
    __dax_zero_page_range(). The nvdimm unit tests fail to check this
    path, but xfstests immediately trips over it. No excuse for missing
    this before submitting the 4.12 pull request.

    These all pass the nvdimm unit tests and an xfstests spot check. The
    set has received a build success notification from the kbuild robot"

    * 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
    filesystem-dax: fix broken __dax_zero_page_range() conversion
    libnvdimm, btt: ensure that initializing metadata clears poison
    libnvdimm: add an atomic vs process context flag to rw_bytes
    x86, pmem: Fix cache flushing for iovec write < 8 bytes
    device-dax: kill NR_DEV_DAX
    block, dax: move "select DAX" from BLOCK to FS_DAX
    device-dax: Tell kbuild DEV_DAX_PMEM depends on DEV_DAX

    Linus Torvalds
     
  • Pull sound fixes from Takashi Iwai:
    "This contains a one-liner change that has a significant impact:
    disabling the build of OSS. It's been unmaintained for long time, and
    we'd like to drop the stuff. Finally, as the first step, stop the
    build. Let's see whether it works without much complaints.

    Other than that, there are two small fixes for HD-audio"

    * tag 'sound-fix-4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
    sound: Disable the build of OSS drivers
    ALSA: hda: Fix cpu lockup when stopping the cmd dmas
    ALSA: hda - Add mute led support for HP EliteBook 840 G3

    Linus Torvalds
     
  • Pull more power-supply updates from Sebastian Reichel:
    "The power-supply subsystem has a few more changes for the v4.12 merge
    window:

    - New battery driver for AXP20X and AXP22X PMICs

    - Improve max17042_battery for usage on x86

    - Misc small cleanups & fixes"

    * tag 'for-v4.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply: (34 commits)
    power: supply: cpcap-charger: Keep trickle charger bits disabled
    power: supply: cpcap-charger: Fix enable for 3.8V charge setting
    power: supply: cpcap-charger: Fix charge voltage configuration
    power: supply: cpcap-charger: Fix charger name
    power: supply: twl4030-charger: make twl4030_bci_property_is_writeable static
    power: supply: sbs-battery: Add alert callback
    mailmap: add Sebastian Reichel
    power: supply: avoid unused twl4030-madc.h
    power: supply: sbs-battery: Correct supply status with current draw
    power: supply: sbs-battery: Don't ignore the first external power change
    power: supply: pda_power: move from timer to delayed_work
    power: supply: max17042_battery: Add support for the SCOPE property
    power: supply: max17042_battery: Add support for the CHARGE_NOW property
    power: supply: max17042_battery: Add support for the CHARGE_FULL_DESIGN property
    power: supply: max17042_battery: mAh readings depend on r_sns value
    power: supply: max17042_battery: Add support for the VOLT_MIN property
    power: supply: max17042_battery: Add support for the TECHNOLOGY attribute
    power: supply: max17042_battery: Add external_power_changed callback
    power: supply: max17042_battery: Add support for the STATUS property
    power: supply: max17042_battery: Add default platform_data fallback data
    ...

    Linus Torvalds
     
  • Pull thermal management updates from Zhang Rui:

    - Fix a problem where orderly_shutdown() is called multiple times
    when the platform thermal driver raises several critical overheating
    events in a short period. (Keerthy)

    - Introduce a backup thermal shutdown mechanism, which invokes
    kernel_power_off()/emergency_restart() directly once a certain amount
    of time (specified via Kconfig) has passed after orderly_shutdown()
    was issued. This is useful when userspace is unable to power off the
    system cleanly and leaves it in a critical state, for example in the
    middle of the driver probing phase. (Keerthy)

    - Introduce a new interface in thermal devfreq_cooling code so that the
    driver can provide more precise data regarding actual power to the
    thermal governor every time the power budget is calculated. (Lukasz
    Luba)

    - Introduce BCM 2835 soc thermal driver and northstar thermal driver,
    within a new sub-folder. (Rafał Miłecki)

    - Introduce DA9062/61 thermal driver. (Steve Twiss)

    - Remove non-DT booting on TI-SoC driver. Also add support to fetching
    coefficients from DT. (Keerthy)

    - Refactor RCAR Gen3 thermal driver. (Niklas Söderlund)

    - Small fix on MTK and intel-soc-dts thermal driver. (Dawei Chien,
    Brian Bian)

    * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux: (25 commits)
    thermal: core: Add a back up thermal shutdown mechanism
    thermal: core: Allow orderly_poweroff to be called only once
    Thermal: Intel SoC DTS: Change interrupt request behavior
    trace: thermal: add another parameter 'power' to the tracing function
    thermal: devfreq_cooling: add new interface for direct power read
    thermal: devfreq_cooling: refactor code and add get_voltage function
    thermal: mt8173: minor mtk_thermal.c cleanups
    thermal: bcm2835: move to the broadcom subdirectory
    thermal: broadcom: ns: specify myself as MODULE_AUTHOR
    thermal: da9062/61: Thermal junction temperature monitoring driver
    Documentation: devicetree: thermal: da9062/61 TJUNC temperature binding
    thermal: broadcom: add Northstar thermal driver
    dt-bindings: thermal: add support for Broadcom's Northstar thermal
    thermal: bcm2835: add thermal driver for bcm2835 SoC
    dt-bindings: Add thermal zone to bcm2835-thermal example
    thermal: rcar_gen3_thermal: add suspend and resume support
    thermal: rcar_gen3_thermal: store device match data in private structure
    thermal: rcar_gen3_thermal: enable hardware interrupts for trip points
    thermal: rcar_gen3_thermal: record and check number of TSCs found
    thermal: rcar_gen3_thermal: check that TSC exists before memory allocation
    ...

    Linus Torvalds
     
  • Pull drm fixes from Dave Airlie:
    "AMD, nouveau, one i915, and one EDID fix for v4.12-rc1

    Some fixes that it would be good to have in rc1. It contains the i915
    quiet fix that you reported.

    It also has an amdgpu fixes pull, with lots of ongoing work on Vega10
    which is new in this kernel and is preliminary support so may have a
    fair bit of movement.

    Otherwise a few non-Vega10 AMD fixes, one EDID fix and some nouveau
    regression fixers"

    * tag 'drm-fixes-for-v4.12-rc1' of git://people.freedesktop.org/~airlied/linux: (144 commits)
    drm/i915: Make vblank evade warnings optional
    drm/nouveau/therm: remove ineffective workarounds for alarm bugs
    drm/nouveau/tmr: avoid processing completed alarms when adding a new one
    drm/nouveau/tmr: fix corruption of the pending list when rescheduling an alarm
    drm/nouveau/tmr: handle races with hw when updating the next alarm time
    drm/nouveau/tmr: ack interrupt before processing alarms
    drm/nouveau/core: fix static checker warning
    drm/nouveau/fb/ram/gf100-: remove 0x10f200 read
    drm/nouveau/kms/nv50: skip core channel cursor update on position-only changes
    drm/nouveau/kms/nv50: fix source-rect-only plane updates
    drm/nouveau/kms/nv50: remove pointless argument to window atomic_check_acquire()
    drm/amd/powerplay: refine pwm1_enable callback functions for CI.
    drm/amd/powerplay: refine pwm1_enable callback functions for vi.
    drm/amd/powerplay: refine pwm1_enable callback functions for Vega10.
    drm/amdgpu: refine amdgpu pwm1_enable sysfs interface.
    drm/amdgpu: add amd fan ctrl mode enums.
    drm/amd/powerplay: add more smu message on Vega10.
    drm/amdgpu: fix dependency issue
    drm/amd: fix init order of sched job
    drm/amdgpu: add some additional vega10 pci ids
    ...

    Linus Torvalds
     
  • Pull SCSI target updates from Nicholas Bellinger:
    "Things were a lot more calm than previously expected. It's primarily
    fixes in various areas, with most of the new functionality centering
    around TCMU backend driver work that Xiubo Li has been driving.

    Here's the summary on the feature side:

    - Make T10-PI verify configurable for emulated (FILEIO + RD) backends
    (Dmitry Monakhov)
    - Allow target-core/TCMU pass-through to use in-kernel SPC-PR logic
    (Bryant Ly + MNC)
    - Add TCMU support for growing ring buffer size (Xiubo Li + MNC)
    - Add TCMU support for global block data pool (Xiubo Li + MNC)

    and on the bug-fix side:

    - Fix COMPARE_AND_WRITE non GOOD status handling for READ phase
    failures (Gary Guo + nab)
    - Fix iscsi-target hang with explicitly changing per NodeACL
    CmdSN number depth with concurrent login driven session
    reinstatement. (Gary Guo + nab)
    - Fix ibmvscsis fabric driver ABORT task handling (Bryant Ly)
    - Fix target-core/FILEIO zero length handling (Bart Van Assche)

    Also, there was an oops introduced with the WRITE_VERIFY changes that
    I ended up reverting at the last minute because, as is not unusual,
    Bart and I could not agree on the fix in time for -rc1. Since it's
    specific to a conformance test, it's been reverted for now.

    There is a separate patch in the queue to address the underlying
    control CDB write overflow regression in >= v4.3, independent of the
    WRITE_VERIFY revert here; it will be pushed post -rc1"

    * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (30 commits)
    Revert "target: Fix VERIFY and WRITE VERIFY command parsing"
    IB/srpt: Avoid that aborting a command triggers a kernel warning
    IB/srpt: Fix abort handling
    target/fileio: Fix zero-length READ and WRITE handling
    ibmvscsis: Do not send aborted task response
    tcmu: fix module removal due to stuck thread
    target: Don't force session reset if queue_depth does not change
    iscsi-target: Set session_fall_back_to_erl0 when forcing reinstatement
    target: Fix compare_and_write_callback handling for non GOOD status
    tcmu: Recalculate the tcmu_cmd size to save cmd area memories
    tcmu: Add global data block pool support
    tcmu: Add dynamic growing data area feature support
    target: fixup error message in target_tg_pt_gp_tg_pt_gp_id_store()
    target: fixup error message in target_tg_pt_gp_alua_access_type_store()
    target/user: PGR Support
    target: Add WRITE_VERIFY_16
    Documentation/target: add an example script to configure an iSCSI target
    target: Use kmalloc_array() in transport_kmap_data_sg()
    target: Use kmalloc_array() in compare_and_write_callback()
    target: Improve size determinations in two functions
    ...

    Linus Torvalds
     
  • Pull misc vfs updates from Al Viro:
    "Making sure that something like a referral point won't end up as pwd
    or root.

    The main part is the last commit (fixing mntns_install()); that one
    fixes a hard-to-hit race. The fchdir() commit is making fchdir(2) a
    bit more robust - it should be impossible to get opened files (even
    O_PATH ones) for referral points in the first place, so the existing
    checks are OK, but checking the same thing as in chdir(2) is just as
    cheap.

    The path_init() commit removes a redundant check that shouldn't have
    been there in the first place"

    * 'work.sane_pwd' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    make sure that mntns_install() doesn't end up with referral for root
    path_init(): don't bother with checking MAY_EXEC for LOOKUP_ROOT
    make sure that fchdir() won't accept referral points, etc.

    Linus Torvalds
     
  • Pull perf updates/fixes from Ingo Molnar:
    "Mostly tooling updates, but also two kernel fixes: a call chain
    handling robustness fix and an x86 PMU driver event definition fix"

    * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    perf/callchain: Force USER_DS when invoking perf_callchain_user()
    tools build: Fixup sched_getcpu feature test
    perf tests kmod-path: Don't fail if compressed modules aren't supported
    perf annotate: Fix AArch64 comment char
    perf tools: Fix spelling mistakes
    perf/x86: Fix Broadwell-EP DRAM RAPL events
    perf config: Refactor a duplicated code for obtaining config file name
    perf symbols: Allow user probes on versioned symbols
    perf symbols: Accept symbols starting at address 0
    tools lib string: Adopt prefixcmp() from perf and subcmd
    perf units: Move parse_tag_value() to units.[ch]
    perf ui gtk: Move gtk .so name to the only place where it is used
    perf tools: Move HAS_BOOL define to where perl headers are used
    perf memswap: Split the byteswap memory range wrappers from util.[ch]
    perf tools: Move event prototypes from util.h to event.h
    perf buildid: Move prototypes from util.h to build-id.h

    Linus Torvalds
     
  • Pull timer fix from Ingo Molnar:
    "A single ARM Juno clocksource driver fix"

    * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    clocksource/arm_arch_timer: Fix arch_timer_mem_find_best_frame()

    Linus Torvalds
     
  • Pull stackprotector fixlet from Ingo Molnar:
    "A single fix/enhancement to increase stackprotector canary randomness
    on 64-bit kernels with very little cost"

    * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    stackprotector: Increase the per-task stack canary's random range from 32 bits to 64 bits on 64-bit platforms

    Linus Torvalds
     
  • Pull x86 fixes from Ingo Molnar:
    "Misc fixes:

    - two boot crash fixes
    - unwinder fixes
    - kexec related kernel direct mappings enhancements/fixes
    - more Clang support quirks
    - minor cleanups
    - Documentation fixes"

    * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86/intel_rdt: Fix a typo in Documentation
    x86/build: Don't add -maccumulate-outgoing-args w/o compiler support
    x86/boot/32: Fix UP boot on Quark and possibly other platforms
    x86/mm/32: Set the '__vmalloc_start_set' flag in initmem_init()
    x86/kexec/64: Use gbpages for identity mappings if available
    x86/mm: Add support for gbpages to kernel_ident_mapping_init()
    x86/boot: Declare error() as noreturn
    x86/mm/kaslr: Use the _ASM_MUL macro for multiplication to work around Clang incompatibility
    x86/mm: Fix boot crash caused by incorrect loop count calculation in sync_global_pgds()
    x86/asm: Don't use RBP as a temporary register in csum_partial_copy_generic()
    x86/microcode/AMD: Remove redundant NULL check on mc

    Linus Torvalds
     
  • Pull xen fixes from Juergen Gross:
    "This contains two fixes for booting under Xen introduced during this
    merge window and two fixes for older problems, where one is just much
    more probable due to another merge window change"

    * tag 'for-linus-4.12b-rc0c-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
    xen: adjust early dom0 p2m handling to xen hypervisor behavior
    x86/amd: don't set X86_BUG_SYSRET_SS_ATTRS when running under Xen
    xen/x86: Do not call xen_init_time_ops() until shared_info is initialized
    x86/xen: fix xsave capability setting

    Linus Torvalds
     
  • Pull more powerpc updates from Michael Ellerman:
    "The change to the Linux page table geometry was delayed for more
    testing with 16G pages, and there's the new CPU features stuff which
    just needed one more polish before going in. Plus a few changes from
    Scott which came in a bit late. And then various fixes, mostly minor.

    Summary highlights:

    - rework the Linux page table geometry to lower memory usage on
    64-bit Book3S (IBM chips) using the Hash MMU.

    - support for a new device tree binding for discovering CPU features
    on future firmwares.

    - Freescale updates from Scott:
    "Includes a fix for a powerpc/next mm regression on 64e, a fix for
    a kernel hang on 64e when using a debugger inside a relocated
    kernel, a qman fix, and misc qe improvements."

    Thanks to: Christophe Leroy, Gavin Shan, Horia Geantă, LiuHailong,
    Nicholas Piggin, Roy Pledge, Scott Wood, Valentin Longchamp"

    * tag 'powerpc-4.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
    powerpc/64s: Support new device tree binding for discovering CPU features
    powerpc: Don't print cpu_spec->cpu_name if it's NULL
    of/fdt: introduce of_scan_flat_dt_subnodes and of_get_flat_dt_phandle
    powerpc/64s: Fix unnecessary machine check handler relocation branch
    powerpc/mm/book3s/64: Rework page table geometry for lower memory usage
    powerpc: Fix distclean with Makefile.postlink
    powerpc/64e: Don't place the stack beyond TASK_SIZE
    powerpc/powernv: Block PCI config access on BCM5718 during EEH recovery
    powerpc/8xx: Adding support of IRQ in MPC8xx GPIO
    soc/fsl/qbman: Disable IRQs for deferred QBMan work
    soc/fsl/qe: add EXPORT_SYMBOL for the 2 qe_tdm functions
    soc/fsl/qe: only apply QE_General4 workaround on affected SoCs
    soc/fsl/qe: round brg_freq to 1kHz granularity
    soc/fsl/qe: get rid of immrbar_virt_to_phys()
    net: ethernet: ucc_geth: fix MEM_PART_MURAM mode
    powerpc/64e: Fix hang when debugging programs with relocated kernel

    Linus Torvalds