01 Oct, 2016

4 commits

  • Fuse allowed VFS to set mode in setattr in order to clear suid/sgid on
    chown and truncate, and (since writeback_cache) write. The problem with
    this is that it'll potentially restore a stale mode.

    The proper fix would be to let the filesystems do the suid/sgid clearing on
    the relevant operations. Possibly some are already doing it but there's no
    way we can detect this.

    So fix this by refreshing and recalculating the mode. Do this only if
    ATTR_KILL_S[UG]ID is set to not destroy performance for writes. This is
    still racy but the size of the window is reduced.

    Signed-off-by: Miklos Szeredi

    Miklos Szeredi
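
The recalculation step described in this commit can be sketched in userspace C. This is only an illustration: the flag names KILL_SUID and KILL_SGID and the function recalc_mode() are hypothetical stand-ins, and the rule shown (always drop setuid, drop setgid only when group-execute is set) is the conventional kernel behaviour rather than a copy of the patch.

```c
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical stand-ins for ATTR_KILL_SUID / ATTR_KILL_SGID. */
#define KILL_SUID 0x1u
#define KILL_SGID 0x2u

/*
 * Start from a freshly fetched mode rather than a cached one, then clear
 * the privilege bits that were requested. setgid is cleared only when
 * group-execute is set; without it the bit historically marks mandatory
 * locking, not privilege.
 */
mode_t recalc_mode(mode_t fresh_mode, unsigned int kill_flags)
{
    mode_t mode = fresh_mode;

    if (kill_flags & KILL_SUID)
        mode &= ~(mode_t)S_ISUID;
    if ((kill_flags & KILL_SGID) && (mode & S_IXGRP))
        mode &= ~(mode_t)S_ISGID;
    return mode;
}
```

Because the input is the refreshed mode, a concurrent chmod that landed before the refresh is preserved; one that lands after it can still be lost, which is the remaining race the commit message mentions.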
     
  • Without "default_permissions" the userspace filesystem's lookup operation
    needs to perform the check for search permission on the directory.

    If the directory does not allow search for everyone (this is quite rare) then
    the userspace filesystem has to set the entry timeout to zero to make sure
    permission checks are always performed.

    Changing the mode bits of the directory should also invalidate the
    (previously cached) dentry to make sure the next lookup will have a chance
    of updating the timeout, if needed.

    Reported-by: Jean-Pierre André
    Signed-off-by: Miklos Szeredi

    Miklos Szeredi
     
  • In preparation for posix acl support, rework fuse to use xattr handlers and
    the generic setxattr/getxattr/listxattr callbacks. Split the xattr code
    out into its own file, and promote symbols to module-global scope as
    needed.

    Functionally these changes have no impact, as fuse still uses a single
    handler for all xattrs which uses the old callbacks.

    Signed-off-by: Seth Forshee
    Signed-off-by: Miklos Szeredi

    Seth Forshee
     
  • Make sure userspace filesystem is returning a well formed list of xattr
    names (zero or more nonzero length, null terminated strings).

    [Michael Theall: only verify in the nonzero size case]

    Signed-off-by: Miklos Szeredi

    Miklos Szeredi
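
The well-formedness rule stated in this commit (zero or more nonzero-length, null-terminated strings) can be checked along the following lines. This is a userspace sketch, not the kernel function; the name xattr_list_is_well_formed() is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Walk the buffer one name at a time. Every name must be terminated
 * inside the buffer and must contain at least one character.
 */
bool xattr_list_is_well_formed(const char *list, size_t size)
{
    while (size > 0) {
        const char *nul = memchr(list, '\0', size);

        if (nul == NULL)   /* last name runs off the end of the buffer */
            return false;
        if (nul == list)   /* empty name */
            return false;

        size_t consumed = (size_t)(nul - list) + 1;
        list += consumed;
        size -= consumed;
    }
    return true;           /* an empty list is valid */
}
```

A caller would presumably run this only when a buffer was actually supplied, since a zero-size listxattr query just asks for the required length and returns no names to verify.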
     

26 Sep, 2016

9 commits

  • Linus Torvalds
     
  • Pull tracefs fixes from Steven Rostedt:
    "Al Viro has been looking at the tracefs code, and has pointed out some
    issues. This contains one fix by me and one by Al. I'm sure that
    he'll come up with more but for now I tested these patches and they
    don't appear to have any negative impact on tracing"

    * tag 'trace-v4.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
    fix memory leaks in tracing_buffers_splice_read()
    tracing: Move mutex to protect against resetting of seq data

    Linus Torvalds
     
  • When building XFS with -Werror, it now fails with:

    include/linux/pagemap.h: In function 'fault_in_multipages_readable':
    include/linux/pagemap.h:602:16: error: variable 'c' set but not used [-Werror=unused-but-set-variable]
    volatile char c;
    ^

    This is a regression caused by commit e23d4159b109 ("fix
    fault_in_multipages_...() on architectures with no-op access_ok()").
    Fix it by re-adding the "(void)c" trick that was previously used to make
    the compiler think the variable is used.

    Signed-off-by: Dave Chinner
    Cc: Al Viro
    Signed-off-by: Linus Torvalds

    Dave Chinner
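
The "(void)c" trick mentioned in this commit looks roughly like the following in isolation. The function touch_first_and_last() is a hypothetical userspace stand-in, not the pagemap.h helper.

```c
#include <stddef.h>

/*
 * Read the first and last byte of a buffer purely for the side effect of
 * touching the memory. The variable is only ever assigned, which is what
 * -Wunused-but-set-variable complains about.
 */
int touch_first_and_last(const char *buf, size_t len)
{
    volatile char c;

    if (len == 0)
        return 0;

    c = buf[0];
    c = buf[len - 1];
    (void)c;   /* evaluates c and discards it, which counts as a use */
    return 0;
}
```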
     
  • The NUMA balancing logic uses an arch-specific PROT_NONE page table flag
    defined by pte_protnone() or pmd_protnone() to mark PTEs or huge page
    PMDs respectively as requiring balancing upon a subsequent page fault.
    User-defined PROT_NONE memory regions which also have this flag set will
    not normally invoke the NUMA balancing code as do_page_fault() will send
    a segfault to the process before handle_mm_fault() is even called.

    However if access_remote_vm() is invoked to access a PROT_NONE region of
    memory, handle_mm_fault() is called via faultin_page() and
    __get_user_pages() without any access checks being performed, meaning
    the NUMA balancing logic is incorrectly invoked on a non-NUMA memory
    region.

    A simple means of triggering this problem is to access PROT_NONE mmap'd
    memory using /proc/self/mem which reliably results in the NUMA handling
    functions being invoked when CONFIG_NUMA_BALANCING is set.

    This issue was reported in bugzilla (issue 99101) which includes some
    simple repro code.

    There are BUG_ON() checks in do_numa_page() and do_huge_pmd_numa_page()
    added at commit c0e7cad to avoid accidentally provoking strange
    behaviour by attempting to apply NUMA balancing to pages that are in
    fact PROT_NONE. The BUG_ON()'s are consistently triggered by the repro.

    This patch moves the PROT_NONE check into mm/memory.c rather than
    invoking BUG_ON() as faulting in these pages via faultin_page() is a
    valid reason for reaching the NUMA check with the PROT_NONE page table
    flag set and is therefore not always a bug.

    Link: https://bugzilla.kernel.org/show_bug.cgi?id=99101
    Reported-by: Trevor Saunders
    Signed-off-by: Lorenzo Stoakes
    Acked-by: Rik van Riel
    Cc: Andrew Morton
    Cc: Mel Gorman
    Signed-off-by: Linus Torvalds

    Lorenzo Stoakes
     
  • Pull MIPS fixes from Ralf Baechle:
    "A round of 4.8 fixes:

    MIPS generic code:
    - Add a missing ".set pop" in an early commit
    - Fix memory regions reaching top of physical
    - MAAR: Fix address alignment
    - vDSO: Fix Malta EVA mapping to vDSO page structs
    - uprobes: fix incorrect uprobe brk handling
    - uprobes: select HAVE_REGS_AND_STACK_ACCESS_API
    - Avoid a BUG warning during PR_SET_FP_MODE prctl
    - SMP: Fix possibility of deadlock when bringing CPUs online
    - R6: Remove compact branch policy Kconfig entries
    - Fix size calc when avoiding IPIs for small icache flushes
    - Fix pre-r6 emulation FPU initialisation
    - Fix delay slot emulation count in debugfs

    ATH79:
    - Fix test for error return of clk_register_fixed_factor.

    Octeon:
    - Fix kernel header to work for VDSO build.
    - Fix initialization of platform device probing.

    paravirt:
    - Fix undefined reference to smp_bootstrap"

    * 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
    MIPS: Fix delay slot emulation count in debugfs
    MIPS: SMP: Fix possibility of deadlock when bringing CPUs online
    MIPS: Fix pre-r6 emulation FPU initialisation
    MIPS: vDSO: Fix Malta EVA mapping to vDSO page structs
    MIPS: Select HAVE_REGS_AND_STACK_ACCESS_API
    MIPS: Octeon: Fix platform bus probing
    MIPS: Octeon: mangle-port: fix build failure with VDSO code
    MIPS: Avoid a BUG warning during prctl(PR_SET_FP_MODE, ...)
    MIPS: c-r4k: Fix size calc when avoiding IPIs for small icache flushes
    MIPS: Add a missing ".set pop" in an early commit
    MIPS: paravirt: Fix undefined reference to smp_bootstrap
    MIPS: Remove compact branch policy Kconfig entries
    MIPS: MAAR: Fix address alignment
    MIPS: Fix memory regions reaching top of physical
    MIPS: uprobes: fix incorrect uprobe brk handling
    MIPS: ath79: Fix test for error return of clk_register_fixed_factor().

    Linus Torvalds
     
  • Pull one more powerpc fix from Michael Ellerman:
    "powernv/pci: Fix m64 checks for SR-IOV and window alignment from
    Russell Currey"

    * tag 'powerpc-4.8-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
    powerpc/powernv/pci: Fix m64 checks for SR-IOV and window alignment

    Linus Torvalds
     
  • The fixes to the radix tree test suite show that the multi-order case is
    broken. The basic reason is that the radix tree code uses tagged
    pointers with the "internal" bit in the low bits, and calculating the
    pointer indices was supposed to mask off those bits. But gcc will
    notice that we then use the index to re-create the pointer, and will
    avoid doing the arithmetic and use the tagged pointer directly.

    This cleans the code up, using the existing is_sibling_entry() helper to
    validate the sibling pointer range (instead of open-coding it), and
    using entry_to_node() to mask off the low tag bit from the pointer. And
    once you do that, you might as well just use the now cleaned-up pointer
    directly.

    [ Side note: the multi-order code isn't actually ever used in the kernel
    right now, and the only reason I didn't just delete all that code is
    that Kirill Shutemov piped up and said:

    "Well, my ext4-with-huge-pages patchset[1] uses multi-order entries.
    It also converts shmem-with-huge-pages and hugetlb to them.

    I'm okay with converting it to other mechanism, but I need
    something. (I looked into Konstantin's RFC patchset[2]. It looks
    okay, but I don't feel myself qualified to review it as I don't
    know much about radix-tree internals.)"

    [1] http://lkml.kernel.org/r/20160915115523.29737-1-kirill.shutemov@linux.intel.com
    [2] http://lkml.kernel.org/r/147230727479.9957.1087787722571077339.stgit@zurg ]

    Reported-by: Matthew Wilcox
    Cc: Andrew Morton
    Cc: Ross Zwisler
    Cc: Johannes Weiner
    Cc: Kirill A. Shutemov
    Cc: Konstantin Khlebnikov
    Cc: Cedric Blancher
    Signed-off-by: Linus Torvalds

    Linus Torvalds
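
The tagged-pointer scheme this commit refers to can be illustrated with a small userspace sketch. The helper names mirror the ones mentioned in the message, but the bodies, the struct layout and the INTERNAL_TAG constant are assumptions made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed tag: the lowest pointer bit marks an "internal" entry. */
#define INTERNAL_TAG ((uintptr_t)1)

/* Any struct with pointer alignment leaves the low bit free for a tag. */
struct node {
    struct node *parent;
    unsigned int count;
};

/* Tag a node pointer so it can be stored in a slot as an internal entry. */
void *node_to_entry(struct node *n)
{
    return (void *)((uintptr_t)n | INTERNAL_TAG);
}

bool entry_is_internal(const void *entry)
{
    return ((uintptr_t)entry & INTERNAL_TAG) != 0;
}

/* Mask the tag off before dereferencing or doing pointer arithmetic. */
struct node *entry_to_node(void *entry)
{
    return (struct node *)((uintptr_t)entry & ~INTERNAL_TAG);
}
```

Going through an explicit untagging helper, rather than masking a derived index and rebuilding the pointer from it, keeps the compiler from shortcutting back to the tagged value.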
     
  • When we replace a multiorder entry, check that all indices reflect the
    new value.

    Also, compile the test suite with -O2, which shows other problems with
    the code due to some dodgy pointer operations in the radix tree code.

    Signed-off-by: Matthew Wilcox
    Signed-off-by: Linus Torvalds

    Matthew Wilcox
     
  • Cc: stable@vger.kernel.org
    Signed-off-by: Al Viro

    Al Viro
     

25 Sep, 2016

11 commits

  • The iter->seq can be reset outside the protection of the mutex. So can
    reading of user data. Move the mutex up to the beginning of the function.

    Fixes: d7350c3f45694 ("tracing/core: make the read callbacks reentrants")
    Cc: stable@vger.kernel.org # 2.6.30+
    Reported-by: Al Viro
    Signed-off-by: Steven Rostedt

    Steven Rostedt (Red Hat)
     
  • Commit 432c6bacbd0c ("MIPS: Use per-mm page to execute branch delay slot
    instructions") accidentally removed use of the MIPS_FPU_EMU_INC_STATS
    macro from do_dsemulret, leading to the ds_emul file in debugfs always
    returning zero even though we perform delay slot emulations.

    Fix this by re-adding the use of the MIPS_FPU_EMU_INC_STATS macro.

    Signed-off-by: Paul Burton
    Fixes: 432c6bacbd0c ("MIPS: Use per-mm page to execute branch delay slot instructions")
    Cc: linux-mips@linux-mips.org
    Patchwork: https://patchwork.linux-mips.org/patch/14301/
    Signed-off-by: Ralf Baechle

    Paul Burton
     
  • This patch fixes the possibility of a deadlock when bringing up
    secondary CPUs.
    The deadlock occurs because the set_cpu_online() is called before
    synchronise_count_slave(). This can cause a deadlock if the boot CPU,
    having scheduled another thread, attempts to send an IPI to the
    secondary CPU, which it sees has been marked online. The secondary is
    blocked in synchronise_count_slave() waiting for the boot CPU to enter
    synchronise_count_master(), but the boot CPU is blocked in
    smp_call_function_many() waiting for the secondary to respond to its
    IPI request.

    Fix this by marking the CPU online in cpu_callin_map and synchronising
    counters before declaring the CPU online and calculating the maps for
    IPIs.

    Signed-off-by: Matt Redfearn
    Reported-by: Justin Chen
    Tested-by: Justin Chen
    Cc: Florian Fainelli
    Cc: stable@vger.kernel.org # v4.1+
    Cc: linux-mips@linux-mips.org
    Patchwork: https://patchwork.linux-mips.org/patch/14302/
    Signed-off-by: Ralf Baechle

    Matt Redfearn
     
  • Pull perf fixes from Thomas Gleixner:
    "Three fixlets for perf:

    - add a missing NULL pointer check in the intel BTS driver

    - make BTS an exclusive PMU because BTS can only handle one event at
    a time

    - ensure that exclusive events are limited to one PMU so that several
    exclusive events can be scheduled on different PMU instances"

    * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    perf/core: Limit matching exclusive events to one PMU
    perf/x86/intel/bts: Make it an exclusive PMU
    perf/x86/intel/bts: Make sure debug store is valid

    Linus Torvalds
     
  • Pull locking fixes from Thomas Gleixner:
    "Two smallish fixes:

    - use the proper asm constraint in the Super-H atomic_fetch_ops

    - a trivial typo fix in the Kconfig help text"

    * 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    locking/hung_task: Fix typo in CONFIG_DETECT_HUNG_TASK help text
    locking/atomic, arch/sh: Fix ATOMIC_FETCH_OP()

    Linus Torvalds
     
  • Pull EFI fixes from Thomas Gleixner:
    "Two fixes for EFI/PAT:

    - a 32bit overflow bug in the PAT code which was unearthed by the
    large EFI mappings

    - prevent a boot hang on large systems when EFI mixed mode is enabled
    but not used"

    * 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86/efi: Only map RAM into EFI page tables if in mixed-mode
    x86/mm/pat: Prevent hang during boot when mapping pages

    Linus Torvalds
     
  • Pull irq fixes from Thomas Gleixner:
    "Three fixes for irq core and irq chip drivers:

    - Do not set the irq type if type is NONE. Fixes a boot regression
    on various SoCs

    - Use the proper cpu for setting up the GIC target list. Discovered
    by the cpumask debugging code.

    - A rather large fix for the MIPS-GIC so per cpu local interrupts
    work again. This was discovered late because the code falls back
    to slower timers which use normal device interrupts"

    * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    irqchip/mips-gic: Fix local interrupts
    irqchip/gicv3: Silence noisy DEBUG_PER_CPU_MAPS warning
    genirq: Skip chained interrupt trigger setup if type is IRQ_TYPE_NONE

    Linus Torvalds
     
  • Merge VM fixes from Hugh Dickins:
    "I get the impression that Andrew is away or busy at the moment, so I'm
    going to send you three independent uncontroversial little mm fixes
    directly - though none is strictly a 4.8 regression fix.

    - shmem: fix tmpfs to handle the huge= option properly from Toshi
    Kani is a one-liner to fix a major embarrassment in 4.8's hugepages
    on tmpfs feature: although Hillf pointed it out in June, somehow
    both Kirill and I repeatedly dropped the ball on this one. You
    might wonder if the feature got tested at all with that bug in:
    yes, it did, but for wider testing coverage, Kirill and I had each
    relied too much on an override which bypasses that condition.

    - huge tmpfs: fix Committed_AS leak just a run-of-the-mill accounting
    fix in the same feature.

    - mm: delete unnecessary and unsafe init_tlb_ubc() is an unrelated
    fix to 4.3's TLB flush batching in reclaim: the bug would be rare,
    and none of us will be shamed if this one misses 4.8; but it got
    such a quick ack from Mel today that I'm inclined to offer it along
    with the first two"

    * emailed patches from Hugh Dickins:
    mm: delete unnecessary and unsafe init_tlb_ubc()
    huge tmpfs: fix Committed_AS leak
    shmem: fix tmpfs to handle the huge= option properly

    Linus Torvalds
     
  • init_tlb_ubc() looked unnecessary to me: tlb_ubc is statically
    initialized with zeroes in the init_task, and copied from parent to
    child while it is quiescent in arch_dup_task_struct(); so I went to
    delete it.

    But inserted temporary debug WARN_ONs in place of init_tlb_ubc() to
    check that it was always empty at that point, and found them firing:
    because memcg reclaim can recurse into global reclaim (when allocating
    biosets for swapout in my case), and arrive back at the init_tlb_ubc()
    in shrink_node_memcg().

    Resetting tlb_ubc.flush_required at that point is wrong: if the upper
    level needs a deferred TLB flush, but the lower level turns out not to,
    we miss a TLB flush. But fortunately, that's the only part of the
    protocol that does not nest: with the initialization removed, cpumask
    collects bits from upper and lower levels, and flushes TLB when needed.

    Fixes: 72b252aed506 ("mm: send one IPI per CPU to TLB flush all entries after unmapping pages")
    Signed-off-by: Hugh Dickins
    Acked-by: Mel Gorman
    Cc: stable@vger.kernel.org # 4.3+
    Signed-off-by: Linus Torvalds

    Hugh Dickins
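
The nesting problem described in this commit can be modelled in a few lines. This is a toy model under stated assumptions: struct tlb_batch, lower_reclaim() and upper_reclaim() are hypothetical names, and a single boolean stands in for tlb_ubc.flush_required.

```c
#include <stdbool.h>

/* Toy stand-in for the per-task deferred-flush state. */
struct tlb_batch {
    bool flush_required;
};

/* Nested (lower-level) reclaim that shares the caller's batch. */
void lower_reclaim(struct tlb_batch *b, bool reset_on_entry,
                   bool needs_flush)
{
    if (reset_on_entry)
        b->flush_required = false;   /* models the removed init call */
    if (needs_flush)
        b->flush_required = true;
}

/*
 * Upper-level reclaim defers a flush, then recurses into a lower level
 * that turns out not to need one. Returns whether the deferred flush
 * is still pending afterwards.
 */
bool upper_reclaim(bool reset_on_entry)
{
    struct tlb_batch b = { .flush_required = false };

    b.flush_required = true;                  /* upper level unmapped pages */
    lower_reclaim(&b, reset_on_entry, false);
    return b.flush_required;
}
```

With the reset in place the upper level's pending flush is silently dropped; without it the flag survives the nested call.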
     
  • Under swapping load on huge tmpfs, /proc/meminfo's Committed_AS grows
    bigger and bigger: just a cosmetic issue for most users, but disabling
    for those who run without overcommit (/proc/sys/vm/overcommit_memory 2).

    shmem_uncharge() was forgetting to unaccount __vm_enough_memory's
    charge, and shmem_charge() was forgetting it on the filesystem-full
    error path.

    Fixes: 800d8c63b2e9 ("shmem: add huge pages support")
    Signed-off-by: Hugh Dickins
    Acked-by: Kirill A. Shutemov
    Signed-off-by: Linus Torvalds

    Hugh Dickins
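
The two forgotten unaccounting steps can be pictured with a simplified model. Everything here is hypothetical (struct accounting, charge(), uncharge()); it only shows the pairing that the fix restores, not the shmem code itself.

```c
/* Toy model: one counter for committed memory, one for filesystem blocks. */
struct accounting {
    long committed;     /* models what feeds Committed_AS */
    long used_blocks;
    long max_blocks;
};

/* Returns 0 on success, -1 when the filesystem is full. */
int charge(struct accounting *a, long pages)
{
    a->committed += pages;               /* commit charge is taken first */

    if (a->used_blocks + pages > a->max_blocks) {
        a->committed -= pages;           /* error path must give it back */
        return -1;
    }
    a->used_blocks += pages;
    return 0;
}

void uncharge(struct accounting *a, long pages)
{
    a->used_blocks -= pages;
    a->committed -= pages;               /* the half that was being missed */
}
```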
     
  • shmem_get_unmapped_area() checks SHMEM_SB(sb)->huge incorrectly, which
    leads to a reversed effect of "huge=" mount option.

    Fix the check in shmem_get_unmapped_area().

    Note, the default value of SHMEM_SB(sb)->huge remains as
    SHMEM_HUGE_NEVER. User will need to specify "huge=" option to enable
    huge page mappings.

    Reported-by: Hillf Danton
    Signed-off-by: Toshi Kani
    Acked-by: Kirill A. Shutemov
    Reviewed-by: Aneesh Kumar K.V
    Signed-off-by: Hugh Dickins
    Signed-off-by: Linus Torvalds

    Toshi Kani
     

24 Sep, 2016

12 commits

  • Pull i2c fixes from Wolfram Sang:
    "Three driver bugfixes: fixing uninitialized memory pointers (eg20t),
    pm/clock imbalance (qup), and a wrongly set cached variable (pca954x)"

    * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
    i2c: qup: skip qup_i2c_suspend if the device is already runtime suspended
    i2c: mux: pca954x: retry updating the mux selection on failure
    i2c-eg20t: fix race between i2c init and interrupt enable

    Linus Torvalds
     
  • Pull input updates from Dmitry Torokhov:
    "Just a fix up for the firmware handling to the Silead driver (which is
    a new driver in this release)"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
    Input: silead_gsl1680 - use "silead/" prefix for firmware loading
    Input: silead_gsl1680 - document firmware-name, fix implementation

    Linus Torvalds
     
  • Pull block fixes from Jens Axboe:
    "Three fixes, two regressions and one that poses a problem in blk-mq
    with the new nvmef code"

    * 'for-linus' of git://git.kernel.dk/linux-block:
    blk-mq: skip unmapped queues in blk_mq_alloc_request_hctx
    nvme-rdma: only clear queue flags after successful connect
    blk-throttle: Extend slice if throttle group is not empty

    Linus Torvalds
     
  • Pull btrfs fixes from Chris Mason:
    "Josef fixed a problem when quotas are enabled with his latest ENOSPC
    rework, and Jeff added more checks into the subvol ioctls to avoid
    tripping up lookup_one_len"

    * 'for-linus-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
    btrfs: ensure that file descriptor used with subvol ioctls is a dir
    Btrfs: handle quota reserve failure properly

    Linus Torvalds
     
  • Pull regmap fix from Mark Brown:
    "A fix for an issue with double locking that was introduced earlier
    this release. I'd missed in review that we were already in a locked
    region when trying to drop part of the cache"

    * tag 'regmap-fix-v4.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
    regmap: fix deadlock on _regmap_raw_write() error path

    Linus Torvalds
     
  • Pull crypto fixes from Herbert Xu:
    "This fixes a regression in RSA that was only half-fixed earlier in the
    cycle. It also fixes an older regression that breaks the keyring
    subsystem"

    * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
    crypto: rsa-pkcs1pad - Handle leading zero for decryption
    KEYS: Fix skcipher IV clobbering

    Linus Torvalds
     
  • Pull arm64 fixes from Catalin Marinas:
    "A couple of last-minute arm64 fixes for 4.8:

    - Fix secondary CPU to NUMA node assignment

    - Fix kgdb breakpoint insertion in read-only text sections (when
    CONFIG_DEBUG_RODATA or CONFIG_DEBUG_SET_MODULE_RONX are enabled)"

    * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
    arm64: kgdb: handle read-only text / modules
    arm64: Call numa_store_cpu_info() earlier.

    Linus Torvalds
     
  • Pull MTD fixes from Richard Weinberger:
    "NAND Fixes for 4.8-rc8.

    This contains fixes for bugs which got introduced in -rc1. Usually
    Brian takes NAND patches from Boris, but since Brian is very busy
    these days with other stuff and Boris is not yet member of the
    kernel.org web of trust I stepped in.

    Boris will be in Berlin at ELCE, I'll sign his key and hopefully other
    Kernel developers too such that he can issue his own pull requests
    soon.

    Summary:

    - Fix a wrong OOB layout definition in the mxc driver
    - Fix incorrect ECC handling in the mtk driver"

    * tag 'tags/nand-fixes-for-4.8-rc8' of git://git.infradead.org/linux-ubifs:
    mtd: nand: mxc: fix obiwan error in mxc_nand_v[12]_ooblayout_free() functions
    mtd: nand: fix chances to create incomplete ECC data when writing
    mtd: nand: fix generating over-boundary ECC data when writing

    Linus Torvalds
     
  • Pull MMC fix from Ulf Hansson:
    "MMC host:

    - dw_mmc: fix the spamming log message"

    * tag 'mmc-v4.8-rc7' of git://git.linaro.org/people/ulf.hansson/mmc:
    mmc: dw_mmc: fix the spamming log message

    Linus Torvalds
     
  • Pull configfs fix from Christoph Hellwig:
    "One more trivial fix for the binary attribute code from Phil Turnbull"

    * tag 'configfs-for-4.8-2' of git://git.infradead.org/users/hch/configfs:
    configfs: Return -EFBIG from configfs_write_bin_file.

    Linus Torvalds
     
    This gives the caller feedback that a given hctx is not mapped and thus
    that no command can be sent on it.

    Signed-off-by: Christoph Hellwig
    Tested-by: Steve Wise
    Signed-off-by: Jens Axboe

    Christoph Hellwig
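
The idea of refusing an allocation on an unmapped hardware queue can be sketched as below. The types and the function alloc_request_on() are hypothetical, and the choice of -EXDEV as the error code is an assumption made for illustration.

```c
#include <errno.h>
#include <stddef.h>

/* Toy hardware queue: mapped if at least one CPU is assigned to it. */
struct hw_queue {
    unsigned int nr_mapped_cpus;
};

struct request {
    struct hw_queue *hctx;
};

/*
 * Bind a request to a specific hardware queue, or report that the queue
 * cannot accept commands because nothing is mapped to it.
 */
int alloc_request_on(struct hw_queue *hctx, struct request *rq)
{
    if (hctx->nr_mapped_cpus == 0)
        return -EXDEV;   /* assumed error code */

    rq->hctx = hctx;
    return 0;
}
```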
     
  • In the mipsr2_decoder() function, used to emulate pre-MIPSr6
    instructions that were removed in MIPSr6, the init_fpu() function is
    called if a removed pre-MIPSr6 floating point instruction is the first
    floating point instruction used by the task. However, init_fpu()
    performs various actions that rely upon not being migrated. For example
    in the most basic case it sets the coprocessor 0 Status.CU1 bit to
    enable the FPU & then loads FP register context into the FPU registers.
    If the task were to migrate during this time, it may end up attempting
    to load FP register context on a different CPU where it hasn't set the
    CU1 bit, leading to errors such as:

    do_cpu invoked from kernel context![#2]:
    CPU: 2 PID: 7338 Comm: fp-prctl Tainted: G D 4.7.0-00424-g49b0c82 #2
    task: 838e4000 ti: 88d38000 task.ti: 88d38000
    $ 0 : 00000000 00000001 ffffffff 88d3fef8
    $ 4 : 838e4000 88d38004 00000000 00000001
    $ 8 : 3400fc01 801f8020 808e9100 24000000
    $12 : dbffffff 807b69d8 807b0000 00000000
    $16 : 00000000 80786150 00400fc4 809c0398
    $20 : 809c0338 0040273c 88d3ff28 808e9d30
    $24 : 808e9d30 00400fb4
    $28 : 88d38000 88d3fe88 00000000 8011a2ac
    Hi : 0040273c
    Lo : 88d3ff28
    epc : 80114178 _restore_fp+0x10/0xa0
    ra : 8011a2ac mipsr2_decoder+0xd5c/0x1660
    Status: 1400fc03 KERNEL EXL IE
    Cause : 1080002c (ExcCode 0b)
    PrId : 0001a920 (MIPS I6400)
    Modules linked in:
    Process fp-prctl (pid: 7338, threadinfo=88d38000, task=838e4000, tls=766527d0)
    Stack : 00000000 00000000 00000000 88d3fe98 00000000 00000000 809c0398 809c0338
    808e9100 00000000 88d3ff28 00400fc4 00400fc4 0040273c 7fb69e18 004a0000
    004a0000 004a0000 7664add0 8010de18 00000000 00000000 88d3fef8 88d3ff28
    808e9100 00000000 766527d0 8010e534 000c0000 85755000 8181d580 00000000
    00000000 00000000 004a0000 00000000 766527d0 7fb69e18 004a0000 80105c20
    ...
    Call Trace:
    [] _restore_fp+0x10/0xa0
    [] mipsr2_decoder+0xd5c/0x1660
    [] do_ri+0x90/0x6b8
    [] ret_from_exception+0x0/0x10

    Fix this by disabling preemption around the call to init_fpu(), ensuring
    that it starts & completes on one CPU.

    Signed-off-by: Paul Burton
    Fixes: b0a668fb2038 ("MIPS: kernel: mips-r2-to-r6-emul: Add R2 emulator for MIPS R6")
    Cc: linux-mips@linux-mips.org
    Cc: stable@vger.kernel.org # v4.0+
    Patchwork: https://patchwork.linux-mips.org/patch/14305/
    Signed-off-by: Ralf Baechle

    Paul Burton
     

23 Sep, 2016

4 commits

  • Handle read-only cases when CONFIG_DEBUG_RODATA (4.0) or
    CONFIG_DEBUG_SET_MODULE_RONX (3.18) are enabled by using
    aarch64_insn_write() instead of probe_kernel_write() as introduced by
    commit 2f896d586610 ("arm64: use fixmap for text patching") in 4.0.

    Fixes: 11d91a770f1f ("arm64: Add CONFIG_DEBUG_SET_MODULE_RONX support")
    Signed-off-by: AKASHI Takahiro
    Reviewed-by: Mark Rutland
    Cc: Will Deacon
    Cc: Jason Wessel
    Signed-off-by: Catalin Marinas

    AKASHI Takahiro
     
  • The wq_numa_init() function makes a private CPU to node map by calling
    cpu_to_node() early in the boot process, before the non-boot CPUs are
    brought online. Since the default implementation of cpu_to_node()
    returns zero for CPUs that have never been brought online, the
    workqueue system's view is that *all* CPUs are on node zero.

    When the unbound workqueue for a non-zero node is created, the
    tsk_cpus_allowed() for the worker threads is the empty set because
    there are, in the view of the workqueue system, no CPUs on non-zero
    nodes. The code in try_to_wake_up() using this empty cpumask ends up
    using the cpumask empty set value of NR_CPUS as an index into the
    per-CPU area pointer array, and gets garbage as it is one past the end
    of the array. This results in:

    [ 0.881970] Unable to handle kernel paging request at virtual address fffffb1008b926a4
    [ 1.970095] pgd = fffffc00094b0000
    [ 1.973530] [fffffb1008b926a4] *pgd=0000000000000000, *pud=0000000000000000, *pmd=0000000000000000
    [ 1.982610] Internal error: Oops: 96000004 [#1] SMP
    [ 1.987541] Modules linked in:
    [ 1.990631] CPU: 48 PID: 295 Comm: cpuhp/48 Tainted: G W 4.8.0-rc6-preempt-vol+ #9
    [ 1.999435] Hardware name: Cavium ThunderX CN88XX board (DT)
    [ 2.005159] task: fffffe0fe89cc300 task.stack: fffffe0fe8b8c000
    [ 2.011158] PC is at try_to_wake_up+0x194/0x34c
    [ 2.015737] LR is at try_to_wake_up+0x150/0x34c
    [ 2.020318] pc : [] lr : [] pstate: 600000c5
    [ 2.027803] sp : fffffe0fe8b8fb10
    [ 2.031149] x29: fffffe0fe8b8fb10 x28: 0000000000000000
    [ 2.036522] x27: fffffc0008c63bc8 x26: 0000000000001000
    [ 2.041896] x25: fffffc0008c63c80 x24: fffffc0008bfb200
    [ 2.047270] x23: 00000000000000c0 x22: 0000000000000004
    [ 2.052642] x21: fffffe0fe89d25bc x20: 0000000000001000
    [ 2.058014] x19: fffffe0fe89d1d00 x18: 0000000000000000
    [ 2.063386] x17: 0000000000000000 x16: 0000000000000000
    [ 2.068760] x15: 0000000000000018 x14: 0000000000000000
    [ 2.074133] x13: 0000000000000000 x12: 0000000000000000
    [ 2.079505] x11: 0000000000000000 x10: 0000000000000000
    [ 2.084879] x9 : 0000000000000000 x8 : 0000000000000000
    [ 2.090251] x7 : 0000000000000040 x6 : 0000000000000000
    [ 2.095621] x5 : ffffffffffffffff x4 : 0000000000000000
    [ 2.100991] x3 : 0000000000000000 x2 : 0000000000000000
    [ 2.106364] x1 : fffffc0008be4c24 x0 : ffffff0ffffada80
    [ 2.111737]
    [ 2.113236] Process cpuhp/48 (pid: 295, stack limit = 0xfffffe0fe8b8c020)
    [ 2.120102] Stack: (0xfffffe0fe8b8fb10 to 0xfffffe0fe8b90000)
    [ 2.125914] fb00: fffffe0fe8b8fb80 fffffc00080e7648
    .
    .
    .
    [ 2.442859] Call trace:
    [ 2.445327] Exception stack(0xfffffe0fe8b8f940 to 0xfffffe0fe8b8fa70)
    [ 2.451843] f940: fffffe0fe89d1d00 0000040000000000 fffffe0fe8b8fb10 fffffc00080e7468
    [ 2.459767] f960: fffffe0fe8b8f980 fffffc00080e4958 ffffff0ff91ab200 fffffc00080e4b64
    [ 2.467690] f980: fffffe0fe8b8f9d0 fffffc00080e515c fffffe0fe8b8fa80 0000000000000000
    [ 2.475614] f9a0: fffffe0fe8b8f9d0 fffffc00080e58e4 fffffe0fe8b8fa80 0000000000000000
    [ 2.483540] f9c0: fffffe0fe8d10000 0000000000000040 fffffe0fe8b8fa50 fffffc00080e5ac4
    [ 2.491465] f9e0: ffffff0ffffada80 fffffc0008be4c24 0000000000000000 0000000000000000
    [ 2.499387] fa00: 0000000000000000 ffffffffffffffff 0000000000000000 0000000000000040
    [ 2.507309] fa20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
    [ 2.515233] fa40: 0000000000000000 0000000000000000 0000000000000000 0000000000000018
    [ 2.523156] fa60: 0000000000000000 0000000000000000
    [ 2.528089] [] try_to_wake_up+0x194/0x34c
    [ 2.533723] [] wake_up_process+0x28/0x34
    [ 2.539275] [] create_worker+0x110/0x19c
    [ 2.544824] [] alloc_unbound_pwq+0x3cc/0x4b0
    [ 2.550724] [] wq_update_unbound_numa+0x10c/0x1e4
    [ 2.557066] [] workqueue_online_cpu+0x220/0x28c
    [ 2.563234] [] cpuhp_invoke_callback+0x6c/0x168
    [ 2.569398] [] cpuhp_up_callbacks+0x44/0xe4
    [ 2.575210] [] cpuhp_thread_fun+0x13c/0x148
    [ 2.581027] [] smpboot_thread_fn+0x19c/0x1a8
    [ 2.586929] [] kthread+0xdc/0xf0
    [ 2.591776] [] ret_from_fork+0x10/0x50
    [ 2.597147] Code: b00057e1 91304021 91005021 b8626822 (b8606821)
    [ 2.603464] ---[ end trace 58c0cd36b88802bc ]---
    [ 2.608138] Kernel panic - not syncing: Fatal exception

    Fix by moving call to numa_store_cpu_info() for all CPUs into
    smp_prepare_cpus(), which happens before wq_numa_init(). Since
    smp_store_cpu_info() now contains only a single function call,
    simplify by removing the function and inlining its contents.

    Suggested-by: Robert Richter
    Fixes: 1a2db300348b ("arm64, numa: Add NUMA support for arm64 platforms.")
    Cc: # 4.7.x-
    Signed-off-by: David Daney
    Reviewed-by: Robert Richter
    Tested-by: Yisheng Xie
    Signed-off-by: Catalin Marinas

    David Daney
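
The out-of-bounds index described in this commit comes from treating the "empty mask" sentinel as a real CPU number. A small userspace model, with hypothetical names (first_cpu(), percpu_lookup()) and a 32-bit mask standing in for a cpumask:

```c
/* Small fixed CPU count for the model; valid indices are 0..NR_CPUS-1. */
#define NR_CPUS 8u

/* Returns the lowest set CPU in the mask, or NR_CPUS if the mask is empty. */
unsigned int first_cpu(unsigned int mask)
{
    for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++)
        if (mask & (1u << cpu))
            return cpu;
    return NR_CPUS;   /* sentinel: one past the last valid index */
}

/*
 * Look up a per-CPU value for any CPU in the mask. The sentinel must be
 * rejected before it is used as an array index.
 */
int percpu_lookup(const int table[NR_CPUS], unsigned int mask, int *out)
{
    unsigned int cpu = first_cpu(mask);

    if (cpu >= NR_CPUS)
        return -1;    /* empty mask: nothing to index */

    *out = table[cpu];
    return 0;
}
```

The commit fixes the root cause (the mask should never have been empty) rather than adding a bounds check at the point of use; the check here only makes the failure mode visible.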
     
  • Fix the indefinitiley -> indefinitely typo in Kconfig.debug.

    Signed-off-by: Vivien Didelot
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/20160922205513.17821-1-vivien.didelot@savoirfairelinux.com
    Signed-off-by: Ingo Molnar

    Vivien Didelot
     
  • Otherwise, nvme_rdma_stop_and_clear_queue() will incorrectly
    try to stop/free rdma qps/cm_ids that are already freed.

    Fixes: e89ca58f9c90 ("nvme-rdma: add DELETING queue flag")
    Reported-by: Steve Wise
    Tested-by: Steve Wise
    Signed-off-by: Sagi Grimberg
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Sagi Grimberg