16 Aug, 2016

2 commits

  • rtree_next_node() walks the linked list of leaf nodes to find the next
    block of pages in the struct memory_bitmap. If it walks off the end of
    the list of nodes, it walks the list of memory zones to find the next
    region of memory. If it walks off the end of the list of zones, it
    returns false.

    This leaves the struct bm_position's node and zone pointers pointing
    at their respective struct list_heads in struct mem_zone_bm_rtree.

    memory_bm_find_bit() uses struct bm_position's node and zone pointers
    to avoid walking lists and trees if the next bit appears in the same
    node/zone. It handles these values being stale.

    Swap rtree_next_node()'s 'step then test' to 'test-next then step'.
    This means that if we reach the end of memory we return false and
    leave the node and zone pointers as they were.

    This fixes a panic on resume using AMD Seattle with 64K pages:
    [ 6.868732] Freezing user space processes ... (elapsed 0.000 seconds) done.
    [ 6.875753] Double checking all user space processes after OOM killer disable... (elapsed 0.000 seconds)
    [ 6.896453] PM: Using 3 thread(s) for decompression.
    [ 6.896453] PM: Loading and decompressing image data (5339 pages)...
    [ 7.318890] PM: Image loading progress: 0%
    [ 7.323395] Unable to handle kernel paging request at virtual address 00800040
    [ 7.330611] pgd = ffff000008df0000
    [ 7.334003] [00800040] *pgd=00000083fffe0003, *pud=00000083fffe0003, *pmd=00000083fffd0003, *pte=0000000000000000
    [ 7.344266] Internal error: Oops: 96000005 [#1] PREEMPT SMP
    [ 7.349825] Modules linked in:
    [ 7.352871] CPU: 2 PID: 1 Comm: swapper/0 Tainted: G W I 4.8.0-rc1 #4737
    [ 7.360512] Hardware name: AMD Overdrive/Supercharger/Default string, BIOS ROD1002C 04/08/2016
    [ 7.369109] task: ffff8003c0220000 task.stack: ffff8003c0280000
    [ 7.375020] PC is at set_bit+0x18/0x30
    [ 7.378758] LR is at memory_bm_set_bit+0x24/0x30
    [ 7.383362] pc : [] lr : [] pstate: 60000045
    [ 7.390743] sp : ffff8003c0283b00
    [ 7.473551]
    [ 7.475031] Process swapper/0 (pid: 1, stack limit = 0xffff8003c0280020)
    [ 7.481718] Stack: (0xffff8003c0283b00 to 0xffff8003c0284000)
    [ 7.800075] Call trace:
    [ 7.887097] [] set_bit+0x18/0x30
    [ 7.891876] [] duplicate_memory_bitmap.constprop.38+0x54/0x70
    [ 7.899172] [] snapshot_write_next+0x22c/0x47c
    [ 7.905166] [] load_image_lzo+0x754/0xa88
    [ 7.910725] [] swsusp_read+0x144/0x230
    [ 7.916025] [] load_image_and_restore+0x58/0x90
    [ 7.922105] [] software_resume+0x2f0/0x338
    [ 7.927752] [] do_one_initcall+0x38/0x11c
    [ 7.933314] [] kernel_init_freeable+0x14c/0x1ec
    [ 7.939395] [] kernel_init+0x10/0xfc
    [ 7.944520] [] ret_from_fork+0x10/0x40
    [ 7.949820] Code: d2800022 8b400c21 f9800031 9ac32043 (c85f7c22)
    [ 7.955909] ---[ end trace 0024a5986e6ff323 ]---
    [ 7.960529] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b

    Here struct mem_zone_bm_rtree's start_pfn has been returned instead of
    struct rtree_node's addr as the node/zone pointers are corrupt after
    we walked off the end of the lists during mark_unsafe_pages().

    This behaviour was exposed by commit 6dbecfd345a6 ("PM / hibernate:
    Simplify mark_unsafe_pages()"), which caused mark_unsafe_pages() to call
    duplicate_memory_bitmap(), which uses memory_bm_find_bit() after walking
    off the end of the memory bitmap.

    Fixes: 3a20cb177961 (PM / Hibernate: Implement position keeping in radix tree)
    Signed-off-by: James Morse
    [ rjw: Subject ]
    Signed-off-by: Rafael J. Wysocki

    James Morse
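
    A minimal user-space sketch of the 'step then test' versus 'test-next
    then step' distinction described in this commit. The struct and function
    names are hypothetical, and a NULL-terminated list stands in for the
    kernel's circular struct list_head lists, so this illustrates the idea
    rather than the actual rtree_next_node() code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a leaf node; next == NULL marks the end. */
struct node {
    int first_pfn;
    struct node *next;
};

/* Hypothetical cursor, loosely modelled on struct bm_position. */
struct cursor {
    struct node *cur;
};

/*
 * 'Step then test': the cursor moves before the end check, so a failed
 * call leaves it pointing at something that is not a valid node.
 */
static bool next_step_then_test(struct cursor *c)
{
    assert(c != NULL && c->cur != NULL);
    c->cur = c->cur->next;
    return c->cur != NULL;
}

/*
 * 'Test-next then step': the end check comes first, so a failed call
 * leaves the cursor on the last valid node.
 */
static bool next_test_then_step(struct cursor *c)
{
    assert(c != NULL && c->cur != NULL);
    if (c->cur->next == NULL)
        return false;
    c->cur = c->cur->next;
    return true;
}
```

    With the second variant, a cursor that has reached the last node still
    points at that node after a failed call, so a later lookup that starts
    from the cursor sees a valid node.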
     
  • The value of temp_level4_pgt is the physical address of the
    top-level page directory, so use __pa() to compute it.

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Ingo Molnar

    Rafael J. Wysocki
     

13 Aug, 2016

2 commits

  • Update some documentation related to system sleep to document new
    features and remove outdated information from it.

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Pavel Machek
    Reviewed-by: Chen Yu

    Rafael J. Wysocki
     
  • Restore the processor state before calling any other functions to
    ensure per-CPU variables can be used with KASLR memory randomization.

    Tracing functions use per-CPU variables (GS based on x86) and one was
    called just before restoring the processor state fully. It resulted
    in a double fault when both the tracing & the exception handler
    functions tried to use a per-CPU variable.

    Fixes: bb3632c6101b (PM / sleep: trace events for suspend/resume)
    Reported-and-tested-by: Borislav Petkov
    Reported-by: Jiri Kosina
    Tested-by: Rafael J. Wysocki
    Tested-by: Jiri Kosina
    Signed-off-by: Thomas Garnier
    Acked-by: Pavel Machek
    Signed-off-by: Rafael J. Wysocki

    Thomas Garnier
     

09 Aug, 2016

1 commit

  • The low-level resume-from-hibernation code on x86-64 uses
    kernel_ident_mapping_init() to create the temporary identity mapping,
    but that function assumes that the offset between kernel virtual
    addresses and physical addresses is aligned on the PGD level.

    However, with a randomized identity mapping base, it may be aligned
    on the PUD level and if that happens, the temporary identity mapping
    created by set_up_temporary_mappings() will not reflect the actual
    kernel identity mapping and the image restoration will fail as a
    result (leading to a kernel panic most of the time).

    To fix this problem, rework kernel_ident_mapping_init() to support
    unaligned offsets between KVA and PA up to the PMD level and make
    set_up_temporary_mappings() use it as appropriate.

    Reported-and-tested-by: Thomas Garnier
    Reported-by: Borislav Petkov
    Suggested-by: Yinghai Lu
    Signed-off-by: Rafael J. Wysocki
    Acked-by: Yinghai Lu

    Rafael J. Wysocki
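
    The alignment cases discussed in this commit can be illustrated with a
    small helper that reports the coarsest page-table level at which a
    KVA-to-PA offset is aligned. The shift values assume x86-64 with 4-level
    paging and 4 KiB pages; the enum and function names are made up for this
    sketch and do not appear in the kernel:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed shifts for x86-64 with 4-level paging and 4 KiB pages. */
#define PMD_SHIFT   21  /* one PMD entry covers 2 MiB */
#define PUD_SHIFT   30  /* one PUD entry covers 1 GiB */
#define PGDIR_SHIFT 39  /* one PGD entry covers 512 GiB */

/* Hypothetical result type: coarsest level the offset is aligned on. */
enum align_level { ALIGN_NONE, ALIGN_PMD, ALIGN_PUD, ALIGN_PGD };

/* True if the low 'shift' bits of 'value' are all zero. */
static int is_aligned(uint64_t value, unsigned int shift)
{
    assert(shift < 64);
    return (value & ((UINT64_C(1) << shift) - 1)) == 0;
}

/* Check from the coarsest level down to the finest one. */
static enum align_level offset_alignment(uint64_t offset)
{
    if (is_aligned(offset, PGDIR_SHIFT))
        return ALIGN_PGD;
    if (is_aligned(offset, PUD_SHIFT))
        return ALIGN_PUD;
    if (is_aligned(offset, PMD_SHIFT))
        return ALIGN_PMD;
    return ALIGN_NONE;
}
```

    An offset that is only PUD-aligned is the case the original
    kernel_ident_mapping_init() did not account for.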
     

03 Aug, 2016

1 commit

  • When CONFIG_RANDOMIZE_MEMORY is set on x86-64, __PAGE_OFFSET becomes
    a variable and using it as a symbol in the image memory restoration
    assembly code under core_restore_code is not correct any more.

    To avoid that problem, modify set_up_temporary_mappings() to compute
    the physical address of the temporary page tables and store it in
    temp_level4_pgt, so that the value of that variable is ready to be
    written into CR3. Then, the assembly code doesn't have to worry
    about converting that value into a physical address and things work
    regardless of whether or not CONFIG_RANDOMIZE_MEMORY is set.

    Reported-and-tested-by: Thomas Garnier
    Signed-off-by: Rafael J. Wysocki

    Rafael J. Wysocki
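
    A rough sketch of the idea, assuming the direct-mapping base is a
    runtime variable (called direct_map_base here, a hypothetical name)
    rather than a compile-time constant. The conversion to a physical
    address is done in C and only the result is stored, so nothing later
    needs to know the base:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a direct-mapping base known only at run time. */
static uint64_t direct_map_base = UINT64_C(0xffff880000000000);

/* Hypothetical stand-in for temp_level4_pgt: holds a physical address. */
static uint64_t temp_pgt_phys;

/* Convert a direct-mapping virtual address to a physical address. */
static uint64_t direct_map_to_phys(uint64_t vaddr)
{
    assert(vaddr >= direct_map_base);
    return vaddr - direct_map_base;
}

/* Store the physical address so later code can load it as-is. */
static void store_temp_pgt(uint64_t pgt_vaddr)
{
    temp_pgt_phys = direct_map_to_phys(pgt_vaddr);
}
```

    The stored value is the same whatever the base happens to be, which is
    the property the assembly code relies on.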
     

29 Jul, 2016

1 commit

  • In kernel bug 150021, a kernel panic was reported when restoring a
    hibernate image. Only a picture of the oops was reported, so I can't
    paste the whole thing here. But here are the most interesting parts:

    kernel tried to execute NX-protected page - exploit attempt? (uid: 0)
    BUG: unable to handle kernel paging request at ffff8804615cfd78
    ...
    RIP: ffff8804615cfd78
    RSP: ffff8804615f0000
    RBP: ffff8804615cfdc0
    ...
    Call Trace:
    do_signal+0x23
    exit_to_usermode_loop+0x64
    ...

    The RIP is on the same page as RBP, so it apparently started executing
    on the stack.

    The bug was bisected to commit ef0f3ed5a4ac (x86/asm/power: Create
    stack frames in hibernate_asm_64.S), which in retrospect seems quite
    dangerous, since that code saves and restores the stack pointer from a
    global variable ('saved_context').

    There are a lot of moving parts in the hibernate save and restore paths,
    so I don't know exactly what caused the panic. Presumably, a FRAME_END
    was executed without the corresponding FRAME_BEGIN, or vice versa. That
    would corrupt the return address on the stack and would be consistent
    with the details of the above panic.

    [ rjw: One major problem is that by the time the FRAME_BEGIN in
    restore_registers() is executed, the stack pointer value may not
    be valid any more. Namely, the stack area pointed to by it
    previously may have been overwritten by some image memory contents
    and that page frame may now be used for whatever different purpose
    it had been allocated for before hibernation. In that case, the
    FRAME_BEGIN will corrupt that memory. ]

    Instead of doing the frame pointer save/restore around the bounds of the
    affected functions, just do it around the call to swsusp_save().

    That has the same effect of ensuring that if swsusp_save() sleeps, the
    frame pointers will be correct. It's also a much more obviously safe
    way to do it than the original patch. And objtool still doesn't report
    any warnings.

    Fixes: ef0f3ed5a4ac (x86/asm/power: Create stack frames in hibernate_asm_64.S)
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=150021
    Cc: 4.6+ # 4.6+
    Reported-by: Andre Reinke
    Tested-by: Andre Reinke
    Signed-off-by: Josh Poimboeuf
    Acked-by: Ingo Molnar
    Signed-off-by: Rafael J. Wysocki

    Josh Poimboeuf
     

22 Jul, 2016

1 commit

  • The test_resume mode verifies that the snapshot data written to the
    swap device can be successfully restored to memory. It eases the
    debugging of hibernation, because it bypasses not only the
    BIOS/bootloader but also the system re-initialization.

    To avoid the risk of breaking the filesystem on persistent storage,
    this patch resumes the image with tasks frozen.

    For example:
    echo test_resume > /sys/power/disk
    echo disk > /sys/power/state

    [ 187.306470] PM: Image saving progress: 70%
    [ 187.395298] PM: Image saving progress: 80%
    [ 187.476697] PM: Image saving progress: 90%
    [ 187.554641] PM: Image saving done.
    [ 187.558896] PM: Wrote 594600 kbytes in 0.90 seconds (660.66 MB/s)
    [ 187.566000] PM: S|
    [ 187.589742] PM: Basic memory bitmaps freed
    [ 187.594694] PM: Checking hibernation image
    [ 187.599865] PM: Image signature found, resuming
    [ 187.605209] PM: Loading hibernation image.
    [ 187.665753] PM: Basic memory bitmaps created
    [ 187.691397] PM: Using 3 thread(s) for decompression.
    [ 187.691397] PM: Loading and decompressing image data (148650 pages)...
    [ 187.889719] PM: Image loading progress: 0%
    [ 188.100452] PM: Image loading progress: 10%
    [ 188.244781] PM: Image loading progress: 20%
    [ 189.057305] PM: Image loading done.
    [ 189.068793] PM: Image successfully loaded

    Suggested-by: Rafael J. Wysocki
    Signed-off-by: Chen Yu
    Signed-off-by: Rafael J. Wysocki

    Chen Yu
     

16 Jul, 2016

1 commit

  • On Intel hardware, native_play_dead() uses mwait_play_dead() by
    default and only falls back to the other methods if that fails.
    That also happens during resume from hibernation, when the restore
    (boot) kernel runs disable_nonboot_cpus() to take all of the CPUs
    except for the boot one offline.

    However, that is problematic, because the address passed to
    __monitor() in mwait_play_dead() is likely to be written to in the
    last phase of hibernate image restoration and that causes the "dead"
    CPU to start executing instructions again. Unfortunately, the page
    containing the address in that CPU's instruction pointer may not be
    valid any more at that point.

    First, that page may have been overwritten with image kernel memory
    contents already, so the instructions the CPU attempts to execute may
    simply be invalid. Second, the page tables previously used by that
    CPU may have been overwritten by image kernel memory contents, so the
    address in its instruction pointer is impossible to resolve then.

    A report from Varun Koyyalagunta and investigation carried out by
    Chen Yu show that the latter sometimes happens in practice.

    To prevent it from happening, temporarily change the smp_ops.play_dead
    pointer during resume from hibernation so that it points to a special
    "play dead" routine which uses hlt_play_dead() and avoids the
    inadvertent "revivals" of "dead" CPUs this way.

    A slightly unpleasant consequence of this change is that if the
    system is hibernated with one or more CPUs offline, it will generally
    draw more power after resume than it did before hibernation, because
    the physical state entered by CPUs via hlt_play_dead() is higher-power
    than the mwait_play_dead() one in the majority of cases. It is
    possible to work around this, but it is unclear how much of a problem
    that's going to be in practice, so the workaround will be implemented
    later if it turns out to be necessary.

    Link: https://bugzilla.kernel.org/show_bug.cgi?id=106371
    Reported-by: Varun Koyyalagunta
    Original-by: Chen Yu
    Tested-by: Chen Yu
    Signed-off-by: Rafael J. Wysocki
    Acked-by: Ingo Molnar

    Rafael J. Wysocki
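
    The temporary pointer swap can be sketched in user space as follows.
    struct cpu_ops, the two routines and the method codes are hypothetical
    stand-ins for smp_ops and the real play_dead implementations, and a
    direct call stands in for taking the non-boot CPUs offline:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the smp_ops structure. */
struct cpu_ops {
    void (*play_dead)(void);
};

enum { METHOD_NONE, METHOD_MWAIT, METHOD_HLT };

/* Records which routine ran last, for illustration only. */
static int last_method = METHOD_NONE;

static void mwait_like_play_dead(void) { last_method = METHOD_MWAIT; }
static void hlt_like_play_dead(void)   { last_method = METHOD_HLT; }

static struct cpu_ops ops = { .play_dead = mwait_like_play_dead };

/* Swap in the HLT-based routine for the duration of the offlining step. */
static void offline_for_resume(struct cpu_ops *o)
{
    void (*saved)(void);

    assert(o != NULL && o->play_dead != NULL);
    saved = o->play_dead;
    o->play_dead = hlt_like_play_dead;
    o->play_dead();         /* stands in for taking CPUs offline */
    o->play_dead = saved;   /* restore the default afterwards */
}
```

    Restoring the saved pointer keeps the default method in place for
    everything outside the resume path.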
     

10 Jul, 2016

5 commits


09 Jul, 2016

1 commit


02 Jul, 2016

4 commits

  • One of the memory bitmaps used by the hibernation image restoration
    code is freed after the image has been loaded.

    That is not quite efficient, though, because the memory pages used
    for building that bitmap are known to be safe (ie. they were not
    used by the image kernel before hibernation) and the arch-specific
    code finalizing the image restoration may need them. In that case
    it needs to allocate those pages again via the memory management
    subsystem, check if they are really safe again by consulting the
    other bitmaps and so on.

    To avoid that, recycle those pages by putting them into the global
    list of known safe pages so that they can be given to the arch code
    right away when necessary.

    Signed-off-by: Rafael J. Wysocki

    Rafael J. Wysocki
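
    A minimal sketch of the recycling idea, assuming the list of known safe
    pages is a singly linked list threaded through the pages themselves.
    The names below are illustrative and the code is not taken from
    kernel/power/snapshot.c:

```c
#include <assert.h>
#include <stddef.h>

/* The first word of each recycled page links to the next safe page. */
struct linked_page {
    struct linked_page *next;
};

static struct linked_page *safe_pages_list;
static unsigned int nr_safe_pages;

/* Put a page that is known to be safe onto the global list. */
static void recycle_safe_page(void *page)
{
    struct linked_page *lp = page;

    assert(page != NULL);
    lp->next = safe_pages_list;
    safe_pages_list = lp;
    nr_safe_pages++;
}

/* Hand out a safe page without going back to the allocator. */
static void *get_safe_page(void)
{
    struct linked_page *lp = safe_pages_list;

    if (lp == NULL)
        return NULL;
    safe_pages_list = lp->next;
    nr_safe_pages--;
    return lp;
}
```

    Pages come back in last-in, first-out order, and no safety check is
    repeated because only pages already known to be safe are ever pushed.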
     
  • Rework mark_unsafe_pages() to use a simpler method of clearing
    all bits in free_pages_map and to set the bits for the "unsafe"
    pages (ie. pages that were used by the image kernel before
    hibernation) with the help of duplicate_memory_bitmap().

    For this purpose, move the pfn_valid() check from mark_unsafe_pages()
    to unpack_orig_pfns() where the "unsafe" pages are discovered.

    Signed-off-by: Rafael J. Wysocki

    Rafael J. Wysocki
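
    The 'clear everything, then copy the set bits' approach can be sketched
    on flat bitmaps. The real struct memory_bitmap is a more elaborate
    per-zone structure, so the helpers below are simplified, hypothetical
    stand-ins:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

static void bm_clear_all(unsigned long *bm, size_t nwords)
{
    assert(bm != NULL);
    memset(bm, 0, nwords * sizeof(*bm));
}

static void bm_set_bit(unsigned long *bm, size_t bit)
{
    bm[bit / BITS_PER_WORD] |= 1UL << (bit % BITS_PER_WORD);
}

static int bm_test_bit(const unsigned long *bm, size_t bit)
{
    return (bm[bit / BITS_PER_WORD] >> (bit % BITS_PER_WORD)) & 1UL;
}

/* Set in dst every bit that is set in src. */
static void bm_duplicate(unsigned long *dst, const unsigned long *src,
                         size_t nbits)
{
    for (size_t bit = 0; bit < nbits; bit++)
        if (bm_test_bit(src, bit))
            bm_set_bit(dst, bit);
}

/* Clear the free map, then mark every page the image kernel used. */
static void mark_unsafe(unsigned long *free_map,
                        const unsigned long *used_map, size_t nwords)
{
    bm_clear_all(free_map, nwords);
    bm_duplicate(free_map, used_map, nwords * BITS_PER_WORD);
}
```

    After mark_unsafe() returns, a set bit in free_map means "used by the
    image kernel", regardless of what the map contained before.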
     
  • The core image restoration code preallocates some safe pages
    (ie. pages that weren't used by the image kernel before hibernation)
    for future use before allocating the bulk of memory for loading the
    image data. Those safe pages are then freed so they can be allocated
    again (with the memory management subsystem's help). That's done to
    ensure that there will be enough safe pages for temporary data
    structures needed during image restoration.

    However, it is not really necessary to free those pages after they
    have been allocated. They can be added to the (global) list of
    safe pages right away and then picked up from there when needed
    without freeing.

    That reduces the overhead related to using safe pages, especially
    in the arch-specific code, so modify the code accordingly.

    Signed-off-by: Rafael J. Wysocki

    Rafael J. Wysocki
     
  • If a freezable workqueue aborts the suspend flow, show the
    workqueue state for debugging purposes.

    Signed-off-by: Roger Lu
    Acked-by: Tejun Heo
    Signed-off-by: Rafael J. Wysocki

    Roger Lu
     

01 Jul, 2016

1 commit

  • Logan Gunthorpe reports that hibernation stopped working reliably for
    him after commit ab76f7b4ab23 (x86/mm: Set NX on gap between __ex_table
    and rodata).

    That turns out to be a consequence of a long-standing issue with the
    64-bit image restoration code on x86, which is that the temporary
    page tables set up by it to avoid page table corruption when the
    last bits of the image kernel's memory contents are copied into
    their original page frames re-use the boot kernel's text mapping,
    but that mapping may very well get corrupted just like any other
    part of the page tables. Of course, if that happens, the final
    jump to the image kernel's entry point will go nowhere.

    The exact reason why commit ab76f7b4ab23 matters here is that it
    sometimes causes a PMD of a large page to be split into PTEs
    that are allocated dynamically and get corrupted during image
    restoration as described above.

    To fix that issue, note that the code copying the last bits of the
    image kernel's memory contents to the page frames occupied by them
    previously doesn't use the kernel text mapping, because it runs from
    a special page covered by the identity mapping set up for that code
    from scratch. Hence, the kernel text mapping is only needed before
    that code starts to run and then it will only be used just for the
    final jump to the image kernel's entry point.

    Accordingly, the temporary page tables set up in swsusp_arch_resume()
    on x86-64 need to contain the kernel text mapping too. That mapping
    is only going to be used for the final jump to the image kernel, so
    it only needs to cover the image kernel's entry point, because the
    first thing the image kernel does after getting control back is to
    switch over to its own original page tables. Moreover, the virtual
    address of the image kernel's entry point in that mapping has to be
    the same as the one mapped by the image kernel's page tables.

    With that in mind, modify the x86-64's arch_hibernation_header_save()
    and arch_hibernation_header_restore() routines to pass the physical
    address of the image kernel's entry point (in addition to its virtual
    address) to the boot kernel (a small piece of assembly code involved
    in passing the entry point's virtual address to the image kernel is
    not necessary any more after that, so drop it). Update RESTORE_MAGIC
    too to reflect the image header format change.

    Next, in set_up_temporary_mappings(), use the physical and virtual
    addresses of the image kernel's entry point passed in the image
    header to set up a minimum kernel text mapping (using memory pages
    that won't be overwritten by the image kernel's memory contents) that
    will map those addresses to each other as appropriate.

    This makes the concern about the possible corruption of the original
    boot kernel text mapping go away and if the minimum kernel text
    mapping used for the final jump marks the image kernel's entry point
    memory as executable, the jump to it is guaranteed to succeed.

    Fixes: ab76f7b4ab23 (x86/mm: Set NX on gap between __ex_table and rodata)
    Link: http://marc.info/?l=linux-pm&m=146372852823760&w=2
    Reported-by: Logan Gunthorpe
    Reported-and-tested-by: Borislav Petkov
    Tested-by: Kees Cook
    Signed-off-by: Rafael J. Wysocki

    Rafael J. Wysocki
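
    For orientation, the table indices that such a minimal text mapping has
    to populate follow directly from the entry point's virtual address. The
    sketch below assumes x86-64 with 4-level paging, 512 entries per table
    and a 2 MiB large page at the PMD level; the helper names are
    illustrative and are not the kernel's:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: x86-64, 4-level paging, 512 entries per table. */
#define PTRS_PER_TABLE 512
#define PMD_SHIFT      21
#define PUD_SHIFT      30
#define PGDIR_SHIFT    39

/* Index into the table at the level selected by 'shift'. */
static unsigned int table_index(uint64_t vaddr, unsigned int shift)
{
    assert(shift == PMD_SHIFT || shift == PUD_SHIFT ||
           shift == PGDIR_SHIFT);
    return (unsigned int)((vaddr >> shift) & (PTRS_PER_TABLE - 1));
}

/* Physical base of the 2 MiB large page that contains 'paddr'. */
static uint64_t pmd_page_base(uint64_t paddr)
{
    return paddr & ~((UINT64_C(1) << PMD_SHIFT) - 1);
}
```

    For a virtual address of 0xffffffff81000000, the PGD, PUD and PMD
    indices come out as 511, 510 and 8. Pointing that PMD entry at the large
    page containing the entry point's physical address makes the two
    addresses map to each other.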
     

28 Jun, 2016

1 commit


27 Jun, 2016

2 commits

  • Linus Torvalds
     
  • Pull SCSI fixes from James Bottomley:
    "Two straightforward fixes.

    One is a concurrency issue only affecting SAS connected SATA drives,
    but which could hang the storage subsystem if it triggers (because the
    outstanding command count on error never goes back to zero) and the
    other is a NO_TAG fallout from the switch to hostwide tags which
    causes the system to crash on module insertion (we've checked
    carefully and only the 53c700 family of drivers is vulnerable to this
    issue)"

    * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
    53c700: fix BUG on untagged commands
    scsi: fix race between simultaneous decrements of ->host_failed

    Linus Torvalds
     

25 Jun, 2016

17 commits

  • …git/mason/linux-btrfs

    Pull btrfs fixes part 2 from Chris Mason:
    "This has one patch from Omar to bring iterate_shared back to btrfs.

    We have a tree of work we queue up for directory items and it doesn't
    lend itself well to shared access. While we're cleaning it up, Omar
    has changed things to use an exclusive lock when there are delayed
    items"

    * 'for-linus-4.7-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
    Btrfs: fix ->iterate_shared() by upgrading i_rwsem for delayed nodes

    Linus Torvalds
     
  • Pull btrfs fixes from Chris Mason:
    "I have a two part pull this time because one of the patches Dave
    Sterba collected needed to be against v4.7-rc2 or higher (we used
    rc4). I try to make my for-linus-xx branch testable on top of the
    last major so we can hand fixes to people on the list more easily, so
    I've split this pull in two.

    This first part has some fixes and two performance improvements that
    we've been testing for some time.

    Josef's two performance fixes are most notable. The transid tracking
    patch makes a big improvement on pretty much every workload"

    * 'for-linus-4.7' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
    Btrfs: Force stripesize to the value of sectorsize
    btrfs: fix disk_i_size update bug when fallocate() fails
    Btrfs: fix error handling in map_private_extent_buffer
    Btrfs: fix error return code in btrfs_init_test_fs()
    Btrfs: don't do nocow check unless we have to
    btrfs: fix deadlock in delayed_ref_async_start
    Btrfs: track transid for delayed ref flushing

    Linus Torvalds
     
  • Pull sound fixes from Takashi Iwai:
    "Again pretty calm weeks: we've had only a few trivial / stable
    HD-audio fixes in addition to a possible race fix for snd-dummy driver
    spotted by syzkaller"

    * tag 'sound-4.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
    ALSA: dummy: Fix a use-after-free at closing
    ALSA: hda / realtek - add two more Thinkpad IDs (5050,5053) for tpt460 fixup
    ALSA: hda - Fix the headset mic jack detection on Dell machine
    ALSA: hda/tegra: iomem fixups for sparse warnings
    ALSA: hdac_regmap - fix the register access for runtime PM

    Linus Torvalds
     
  • Pull x86 kprobe fix from Thomas Gleixner:
    "A single fix clearing the TF bit when a fault is single stepped"

    * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    kprobes/x86: Clear TF bit in fault on single-stepping

    Linus Torvalds
     
  • Pull scheduler fixes from Thomas Gleixner:
    "A couple of scheduler fixes:

    - force watchdog reset while processing sysrq-w

    - fix a deadlock when enabling trace events in the scheduler

    - fixes to the throttled next buddy logic

    - fixes for the average accounting (missing serialization and
    underflow handling)

    - allow kernel threads for fallback to online but not active cpus"

    * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    sched/core: Allow kthreads to fall back to online && !active cpus
    sched/fair: Do not announce throttled next buddy in dequeue_task_fair()
    sched/fair: Initialize throttle_count for new task-groups lazily
    sched/fair: Fix cfs_rq avg tracking underflow
    kernel/sysrq, watchdog, sched/core: Reset watchdog on all CPUs while processing sysrq-w
    sched/debug: Fix deadlock when enabling sched events
    sched/fair: Fix post_init_entity_util_avg() serialization

    Linus Torvalds
     
  • Commit fe742fd4f90f ("Revert "btrfs: switch to ->iterate_shared()"")
    backed out the conversion to ->iterate_shared() for Btrfs because the
    delayed inode handling in btrfs_real_readdir() is racy. However, we can
    still do readdir in parallel if there are no delayed nodes.

    This is a temporary fix which upgrades the shared inode lock to an
    exclusive lock only when we have delayed items until we come up with a
    more complete solution. While we're here, rename the
    btrfs_{get,put}_delayed_items functions to make it very clear that
    they're just for readdir.

    Tested with xfstests and by doing a parallel kernel build:

    while make tinyconfig && make -j4 && git clean -dqfx; do
        :
    done

    along with a bunch of parallel finds in another shell:

    while true; do
        for ((i=0; i<4; i++)); do
            find . >/dev/null &
        done
        wait
    done

    Signed-off-by: Omar Sandoval
    Signed-off-by: David Sterba
    Signed-off-by: Chris Mason

    Omar Sandoval
     
  • Pull locking fix from Thomas Gleixner:
    "A single fix to address a race in the static key logic"

    * 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    locking/static_key: Fix concurrent static_key_slow_inc()

    Linus Torvalds
     
  • Pull irq fix from Thomas Gleixner:
    "A single fix for the fallout from the conversion of MIPS GIC to irq
    domains"

    * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    irqchip/mips-gic: Fix IRQs in gic_dev_domain

    Linus Torvalds
     
  • Pull powerpc fixes from Michael Ellerman:
    "mm/radix (Aneesh Kumar K.V):
    - Update to tlb functions ric argument
    - Flush page walk cache when freeing page table
    - Update Radix tree size as per ISA 3.0

    mm/hash (Aneesh Kumar K.V):
    - Use the correct PPP mask when updating HPTE
    - Don't add memory coherence if cache inhibited is set

    eeh (Gavin Shan):
    - Fix invalid cached PE primary bus

    bpf/jit (Naveen N. Rao):
    - Disable classic BPF JIT on ppc64le

    .. and fix faults caused by radix patching of SLB miss handler
    (Michael Ellerman)"

    * tag 'powerpc-4.7-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
    powerpc/bpf/jit: Disable classic BPF JIT on ppc64le
    powerpc: Fix faults caused by radix patching of SLB miss handler
    powerpc/eeh: Fix invalid cached PE primary bus
    powerpc/mm/radix: Update Radix tree size as per ISA 3.0
    powerpc/mm/hash: Don't add memory coherence if cache inhibited is set
    powerpc/mm/hash: Use the correct PPP mask when updating HPTE
    powerpc/mm/radix: Flush page walk cache when freeing page table
    powerpc/mm/radix: Update to tlb functions ric argument

    Linus Torvalds
     
  • Commit b235beea9e99 ("Clarify naming of thread info/stack allocators")
    breaks the build on some powerpc configs, where THREAD_SIZE < PAGE_SIZE:

    kernel/fork.c:235:2: error: implicit declaration of function 'free_thread_stack'
    kernel/fork.c:355:8: error: assignment from incompatible pointer type
    stack = alloc_thread_stack_node(tsk, node);
    ^

    Fix it by renaming free_stack() to free_thread_stack(), and updating the
    return type of alloc_thread_stack_node().

    Fixes: b235beea9e99 ("Clarify naming of thread info/stack allocators")
    Signed-off-by: Michael Ellerman
    Signed-off-by: Linus Torvalds

    Michael Ellerman
     
  • Merge misc fixes from Andrew Morton:
    "Two weeks worth of fixes here"

    * emailed patches from Andrew Morton: (41 commits)
    init/main.c: fix initcall_blacklisted on ia64, ppc64 and parisc64
    autofs: don't get stuck in a loop if vfs_write() returns an error
    mm/page_owner: avoid null pointer dereference
    tools/vm/slabinfo: fix spelling mistake: "Ocurrences" -> "Occurrences"
    fs/nilfs2: fix potential underflow in call to crc32_le
    oom, suspend: fix oom_reaper vs. oom_killer_disable race
    ocfs2: disable BUG assertions in reading blocks
    mm, compaction: abort free scanner if split fails
    mm: prevent KASAN false positives in kmemleak
    mm/hugetlb: clear compound_mapcount when freeing gigantic pages
    mm/swap.c: flush lru pvecs on compound page arrival
    memcg: css_alloc should return an ERR_PTR value on error
    memcg: mem_cgroup_migrate() may be called with irq disabled
    hugetlb: fix nr_pmds accounting with shared page tables
    Revert "mm: disable fault around on emulated access bit architecture"
    Revert "mm: make faultaround produce old ptes"
    mailmap: add Boris Brezillon's email
    mailmap: add Antoine Tenart's email
    mm, sl[au]b: add __GFP_ATOMIC to the GFP reclaim mask
    mm: mempool: kasan: don't poot mempool objects in quarantine
    ...

    Linus Torvalds
     
  • Pull rdma fixes from Doug Ledford:
    "This is the second batch of queued up rdma patches for this rc cycle.

    There isn't anything really major in here. It's passed 0day,
    linux-next, and local testing across a wide variety of hardware.
    There are still a few known issues to be tracked down, but this should
    amount to the vast majority of the rdma RC fixes.

    Round two of 4.7 rc fixes:

    - A couple minor fixes to the rdma core
    - Multiple minor fixes to hfi1
    - Multiple minor fixes to mlx4/mlx5
    - A few minor fixes to i40iw"

    * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (31 commits)
    IB/srpt: Reduce QP buffer size
    i40iw: Enable level-1 PBL for fast memory registration
    i40iw: Return correct max_fast_reg_page_list_len
    i40iw: Correct status check on i40iw_get_pble
    i40iw: Correct CQ arming
    IB/rdmavt: Correct qp_priv_alloc() return value test
    IB/hfi1: Don't zero out qp->s_ack_queue in rvt_reset_qp
    IB/hfi1: Fix deadlock with txreq allocation slow path
    IB/mlx4: Prevent cross page boundary allocation
    IB/mlx4: Fix memory leak if QP creation failed
    IB/mlx4: Verify port number in flow steering create flow
    IB/mlx4: Fix error flow when sending mads under SRIOV
    IB/mlx4: Fix the SQ size of an RC QP
    IB/mlx5: Fix wrong naming of port_rcv_data counter
    IB/mlx5: Fix post send fence logic
    IB/uverbs: Initialize ib_qp_init_attr with zeros
    IB/core: Fix false search of the IB_SA_WELL_KNOWN_GUID
    IB/core: Fix RoCE v1 multicast join logic issue
    IB/core: Fix no default GIDs when netdevice reregisters
    IB/hfi1: Send a pkey change event on driver pkey update
    ...

    Linus Torvalds
     
  • Pull HID fix from Jiri Kosina:
    "hiddev ioctl() validation fix from Scott Bauer"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
    HID: hiddev: validate num_values for HIDIOCGUSAGES, HIDIOCSUSAGES commands

    Linus Torvalds
     
  • …l/git/groeck/linux-staging

    Pull hwmon fix from Guenter Roeck:
    "Improve fan type detection for dell-smm to prevent kernel hang"

    * tag 'hwmon-for-linus-v4.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
    hwmon: (dell-smm) Cache fan_type() calls and change fan detection

    Linus Torvalds
     
  • Pull ACPI fix from Rafael Wysocki:
    "Stable-candidate fix for a deadlock in ACPICA introduced during the
    4.5 development cycle by a commit attempting to improve the handling
    of AML code that doesn't belong to any namespace objects in a given
    definition block (Lv Zheng)"

    * tag 'acpi-4.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
    ACPICA: Namespace: Fix deadlock triggered by MLC support in dynamic table loading

    Linus Torvalds
     
  • Pull power management fixes from Rafael Wysocki:
    "Fix for a latent cpufreq driver bug uncovered by a recent ACPICA
    change and several fixes for the devfreq framework, including one fix
    for an issue introduced recently.

    Specifics:

    - Fix a latent initialization issue in the pcc-cpufreq driver
    (incorrect initial value of a structure field) that has been
    uncovered by a recent ACPICA commit (Mike Galbraith).

    - Add a missing notification in an update_devfreq() error code path
    forgotten by a recent devfreq commit (Chanwoo Choi).

    - Fix devfreq device frequency initialization (Lukasz Luba).

    - Fix an incorrect IS_ERR() check in the devfreq framework discovered
    by the Smatch checker (Dan Carpenter).

    - Drop two excessive put_device() calls from the devfreq framework
    (MyungJoo Ham, Cai Zhiyong).

    - Fix a possible memory leak in the devfreq framework and drop an
    unnecessary kfree() invocation from it (MyungJoo Ham)"

    * tag 'pm-4.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
    PM / devfreq: Send the DEVFREQ_POSTCHANGE notification when target() is failed
    cpufreq: pcc-cpufreq: Fix doorbell.access_width
    PM / devfreq: fix initialization of current frequency in last status
    PM / devfreq: exynos-nocp: Remove incorrect IS_ERR() check
    PM / devfreq: remove double put_device
    PM / devfreq: fix double call put_device
    PM / devfreq: fix duplicated kfree on devfreq pointer
    PM / devfreq: devm_kzalloc to have dev pointer more precisely

    Linus Torvalds
     
  • Pull xen bug fixes from David Vrabel:

    - fix x86 PV dom0 crash during early boot on some hardware

    - fix two pciback bugs affecting certain devices

    - fix potential overflow when clearing page tables in x86 PV

    * tag 'for-linus-4.7b-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
    xen-pciback: return proper values during BAR sizing
    x86/xen: avoid m2p lookup when setting early page table entries
    xen/pciback: Fix conf_space read/write overlap check.
    x86/xen: fix upper bound of pmd loop in xen_cleanhighmap()
    xen/balloon: Fix declared-but-not-defined warning

    Linus Torvalds