23 Apr, 2016

2 commits

  • lock_chain::base is used to store an index into the chain_hlocks[]
    array; however, that array contains more elements than a u16 can
    index.

    Change the lock_chain structure to use a bitfield to encode the data
    it needs and add BUILD_BUG_ON() assertions to check the fields are
    wide enough.

    Also, for DEBUG_LOCKDEP, assert that we don't run out of elements of
    that array, as that would wreck the collision detection.
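
    A minimal userspace sketch of the idea, assuming C11: _Static_assert
    stands in for BUILD_BUG_ON(), and the demo_ names, sizes and bit widths
    are hypothetical values picked for illustration, not the kernel's.

```c
/* Hypothetical limits for illustration only. */
#define DEMO_MAX_CHAIN_HLOCKS (5u * 65536u)  /* more entries than a u16 can index */
#define DEMO_MAX_DEPTH        48u

/* Bit widths chosen for this sketch. */
#define DEMO_BASE_BITS  24
#define DEMO_DEPTH_BITS 6

struct demo_lock_chain {
    unsigned int irq_context : 2;
    unsigned int depth       : DEMO_DEPTH_BITS;
    unsigned int base        : DEMO_BASE_BITS;
};

/* Compile-time checks that each field is wide enough for its maximum value. */
_Static_assert((1u << DEMO_BASE_BITS) > DEMO_MAX_CHAIN_HLOCKS,
               "base field too narrow for the array");
_Static_assert((1u << DEMO_DEPTH_BITS) > DEMO_MAX_DEPTH,
               "depth field too narrow");

/* Store an index in the bitfield and read it back. */
static unsigned int demo_store_base(struct demo_lock_chain *c, unsigned int idx)
{
    c->base = idx;
    return c->base;
}
```

    An index such as 70000 survives the round trip through the 24-bit field,
    whereas the same value would be truncated by a 16-bit type.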

    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Alfredo Alvarez Fernandez
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Sedat Dilek
    Cc: Theodore Ts'o
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/20160330093659.GS3408@twins.programming.kicks-ass.net
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • task_irq_context() returns the encoded irq_context of the task; the
    return value is encoded in the same way as ->irq_context of held_lock.

    Always return 0 if !(CONFIG_TRACE_IRQFLAGS && CONFIG_PROVE_LOCKING)
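
    A rough sketch of such a helper, assuming a two-bit encoding (hard-IRQ
    state in bit 1, soft-IRQ state in bit 0). The struct, the demo_ names
    and the tracking_enabled parameter (which models the two config options)
    are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-task context counters. */
struct demo_task {
    int hardirq_context;
    int softirq_context;
};

/*
 * tracking_enabled models CONFIG_TRACE_IRQFLAGS && CONFIG_PROVE_LOCKING.
 * Without tracking there is no context information, so report 0.
 */
static unsigned int demo_task_irq_context(const struct demo_task *t,
                                          bool tracking_enabled)
{
    if (!tracking_enabled)
        return 0;
    /* Assumed encoding: bit 1 = in hard IRQ, bit 0 = in soft IRQ. */
    return 2u * !!t->hardirq_context + !!t->softirq_context;
}
```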

    Signed-off-by: Boqun Feng
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: Josh Triplett
    Cc: Lai Jiangshan
    Cc: Linus Torvalds
    Cc: Mathieu Desnoyers
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Steven Rostedt
    Cc: Thomas Gleixner
    Cc: sasha.levin@oracle.com
    Link: http://lkml.kernel.org/r/1455602265-16490-2-git-send-email-boqun.feng@gmail.com
    Signed-off-by: Ingo Molnar

    Boqun Feng
     

21 Apr, 2016

2 commits

  • The recent decoupling of pagefault disable and preempt disable added an
    explicit preempt_disable/enable() pair to the futex_atomic_cmpxchg_inatomic()
    implementation in asm-generic/futex.h. But it forgot to add preempt_enable()
    calls to the error handling code paths, which results in a preemption count
    imbalance.

    This is observable on boot when the test for atomic_cmpxchg() is calling
    futex_atomic_cmpxchg_inatomic() on a NULL pointer.

    Add the missing preempt_enable() calls to the error handling code paths.
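
    The shape of the bug can be sketched in userspace as follows. Everything
    here is a hypothetical model: a plain counter stands in for the
    preemption count, and a NULL pointer models the faulting user access.
    The sketch uses a single exit label, which is one way to keep the pair
    balanced; the patch itself adds the calls on each error path.

```c
#include <errno.h>
#include <stddef.h>

static int demo_preempt_count;  /* hypothetical stand-in for the preempt count */

static void demo_preempt_disable(void) { demo_preempt_count++; }
static void demo_preempt_enable(void)  { demo_preempt_count--; }

/* Compare-and-exchange model that re-enables "preemption" on every path. */
static int demo_cmpxchg(unsigned int *uval, unsigned int *uaddr,
                        unsigned int oldval, unsigned int newval)
{
    int ret = 0;

    demo_preempt_disable();

    if (uaddr == NULL) {        /* models a fault on the user access */
        ret = -EFAULT;
        goto out;               /* an early return here would leak the count */
    }

    *uval = *uaddr;
    if (*uaddr == oldval)
        *uaddr = newval;
out:
    demo_preempt_enable();
    return ret;
}
```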

    [ tglx: Massaged changelog ]

    Fixes: d9b9ff8c1889 ("sched/preempt, futex: Disable preemption in UP futex_atomic_cmpxchg_inatomic() explicitly")
    Signed-off-by: Romain Perier
    Cc: linux-arch@vger.kernel.org
    Cc: Thomas Petazzoni
    Cc: Arnd Bergmann
    Cc: Peter Zijlstra
    Cc: stable@vger.kernel.org
    Link: http://lkml.kernel.org/r/1460640963-690-1-git-send-email-romain.perier@free-electrons.com
    Signed-off-by: Thomas Gleixner

    Romain Perier
     
  • Otherwise an incoming waker on the dest hash bucket can miss
    the waiter adding itself to the plist during the lockless
    check optimization (small window but still the correct way
    of doing this); similarly to the decrement counterpart.

    Suggested-by: Peter Zijlstra
    Signed-off-by: Davidlohr Bueso
    Cc: Davidlohr Bueso
    Cc: bigeasy@linutronix.de
    Cc: dvhart@infradead.org
    Cc: stable@kernel.org
    Link: http://lkml.kernel.org/r/1461208164-29150-1-git-send-email-dave@stgolabs.net
    Signed-off-by: Thomas Gleixner

    Davidlohr Bueso
     

20 Apr, 2016

1 commit

  • If userspace calls UNLOCK_PI unconditionally without trying the TID -> 0
    transition in user space first then the user space value might not have the
    waiters bit set. This opens the following race:

    CPU0                            CPU1
    uval = get_user(futex)
                                    lock(hb)
    lock(hb)
                                    futex |= FUTEX_WAITERS
                                    ....
                                    unlock(hb)

    cmpxchg(futex, uval, newval)

    So the cmpxchg fails and returns -EINVAL to user space, which is wrong because
    the futex value is valid.

    To handle this (yes, yet another) corner case gracefully, check for a flag
    change and retry.
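
    A single-threaded sketch of the check-and-retry idea. The demo_ names
    are hypothetical; the two constants are assumed to mirror the futex
    waiters bit and TID mask. A value that differs from the expected one
    only in its flag bits is treated as "retry", anything else as invalid.

```c
#include <errno.h>
#include <stdint.h>

#define DEMO_FUTEX_WAITERS  0x80000000u  /* assumed to mirror FUTEX_WAITERS */
#define DEMO_FUTEX_TID_MASK 0x3fffffffu  /* assumed to mirror FUTEX_TID_MASK */

/* Classify the value a failed cmpxchg observed against the expected one. */
static int demo_classify(uint32_t uval, uint32_t curval)
{
    if (curval == uval)
        return 0;
    /* Same owner TID, only flag bits changed: a waiter raced with us. */
    if ((curval & DEMO_FUTEX_TID_MASK) == (uval & DEMO_FUTEX_TID_MASK))
        return -EAGAIN;
    return -EINVAL;
}

/* Model of the unlock path: re-read and retry while only flags changed. */
static int demo_unlock(uint32_t *futex, uint32_t stale_uval, uint32_t newval)
{
    uint32_t uval = stale_uval;

    for (;;) {
        uint32_t curval = *futex;   /* what the cmpxchg would observe */
        int ret = demo_classify(uval, curval);

        if (ret == 0) {
            *futex = newval;
            return 0;
        }
        if (ret != -EAGAIN)
            return ret;
        uval = curval;              /* pick up the new flags and retry */
    }
}
```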

    [ tglx: Massaged changelog and slightly reworked implementation ]

    Fixes: ccf9e6a80d9e ("futex: Make unlock_pi more robust")
    Signed-off-by: Sebastian Andrzej Siewior
    Cc: stable@vger.kernel.org
    Cc: Davidlohr Bueso
    Cc: Darren Hart
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1460723739-5195-1-git-send-email-bigeasy@linutronix.de
    Signed-off-by: Thomas Gleixner

    Sebastian Andrzej Siewior
     

19 Apr, 2016

1 commit

  • While playing with the qstat statistics (in /qlockstat/) I ran into
    the following splat on a VM when opening pv_hash_hops:

    divide error: 0000 [#1] SMP
    ...
    RIP: 0010:[] [] qstat_read+0x12e/0x1e0
    ...
    Call Trace:
    [] ? mem_cgroup_commit_charge+0x6c/0xd0
    [] ? page_add_new_anon_rmap+0x8c/0xd0
    [] ? handle_mm_fault+0x1439/0x1b40
    [] ? do_mmap+0x449/0x550
    [] ? __vfs_read+0x23/0xd0
    [] ? rw_verify_area+0x52/0xd0
    [] ? vfs_read+0x81/0x120
    [] ? SyS_read+0x42/0xa0
    [] ? entry_SYSCALL_64_fastpath+0x1e/0xa8

    Fix this by verifying that qstat_pv_kick_unlock is in fact non-zero,
    similarly to what the qstat_pv_latency_wake case does. If nothing
    else, a zero count can come from resetting the statistics, so having
    0 kicks should be quite valid in this context.
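
    The guard amounts to something like the sketch below. The function name
    and the fixed-point scaling are hypothetical and not taken from the
    patch; the point is only that a zero divisor is reported as 0 rather
    than divided by.

```c
#include <stdint.h>

/*
 * Average hops per kick, scaled by 100 to keep two decimal places.
 * A zero kick count (for example right after a statistics reset) yields 0.
 */
static uint64_t demo_avg_hops_x100(uint64_t hops, uint64_t kicks)
{
    if (kicks == 0)
        return 0;
    return hops * 100 / kicks;
}
```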

    Signed-off-by: Davidlohr Bueso
    Reviewed-by: Waiman Long
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: dave@stgolabs.net
    Cc: waiman.long@hpe.com
    Link: http://lkml.kernel.org/r/1460961103-24953-1-git-send-email-dave@stgolabs.net
    Signed-off-by: Ingo Molnar

    Davidlohr Bueso
     

18 Apr, 2016

2 commits


17 Apr, 2016

11 commits


16 Apr, 2016

7 commits

  • Pull block fixes from Jens Axboe:
    "A few fixes for the current series. This contains:

    - Two fixes for NVMe:

    One fixes a reset race that can be triggered by repeated
    insert/removal of the module.

    The other fixes an issue on some platforms, where we get probe
    timeouts since legacy interrupts aren't working. This used not to
    be a problem since we had the worker thread poll for completions,
    but since that was killed off, it means those poor souls can't
    successfully probe their NVMe device. Use a proper IRQ check and
    probe (msi-x -> msi -> legacy), like most other drivers to work
    around this. Both from Keith.

    - A loop corruption issue with offset in iters, from Ming Lei.

    - A fix for not having the partition stat per cpu ref count
    initialized before sending out the KOBJ_ADD, which could cause user
    space to access the counter prior to initialization. Also from
    Ming Lei.

    - A fix for using the wrong congestion state, from Kaixu Xia"

    * 'for-linus' of git://git.kernel.dk/linux-block:
    block: loop: fix filesystem corruption in case of aio/dio
    NVMe: Always use MSI/MSI-x interrupts
    NVMe: Fix reset/remove race
    writeback: fix the wrong congested state variable definition
    block: partition: initialize percpuref before sending out KOBJ_ADD

    Linus Torvalds
     
  • Pull libnvdimm fixes from Ross Zwisler:
    "Two fixes:

    - Fix memcpy_from_pmem() to fallback to memcpy() for architectures
    where CONFIG_ARCH_HAS_PMEM_API=n.

    - Add a comment explaining why we write data twice when clearing
    poison in pmem_do_bvec().

    This has passed a boot test on an X86_32 config, which was the
    architecture where issue #1 above was first noticed"

    Dan Williams adds:
    "We're giving this multi-maintainer setup a shot, so expect libnvdimm
    pull requests from either Ross or I going forward"

    * 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
    libnvdimm, pmem: clarify the write+clear_poison+write flow
    pmem: fix BUG() error in pmem.h:48 on X86_32

    Linus Torvalds
     
  • Pull MTD fix from Brian Norris:
    "One MTD fix for v4.6-rc4:

    In the v4.4 cycle, we relaxed the requirement for assigning
    mtd->owner, but we didn't remove this error case. It's hit only
    by drivers that are both:

    (a) using nand_scan() directly
    and
    (b) built as modules

    We haven't seen explicit complaints about this (most use cases don't
    fit one or both of the above), but we should definitely not be
    BUG()'ing here"

    * tag 'for-linus-20160415' of git://git.infradead.org/linux-mtd:
    mtd: nand: Drop mtd.owner requirement in nand_scan

    Linus Torvalds
     
  • Pull MMC fixes from Ulf Hansson:
    "Here are a couple of mmc fixes intended for v4.6 rc4.

    Regarding the fix for the regression about mmcblk device indexes. The
    approach taken to solve the problem seems to be good enough. There
    were some discussions around the solution, but it seems like people
    were happy about it in the end.

    MMC core:
    - Restore similar old behaviour when assigning mmcblk device indexes

    MMC host:
    - tegra: Disable UHS-I modes for Tegra124 to fix regression"

    * tag 'mmc-v4.6-rc3' of git://git.linaro.org/people/ulf.hansson/mmc:
    mmc: tegra: Disable UHS-I modes for Tegra124
    mmc: block: Use the mmc host device index as the mmcblk device index

    Linus Torvalds
     
  • Pull drm fixes from Dave Airlie:
    "This contains fixes for exynos, amdgpu, radeon, i915 and qxl.

    It also contains some fixes to the core drm edid parser.

    qxl:
    - fix for a cursor hotspot issue

    radeon:
    - some MST fixes that I've been running locally and make my monitor a
    bit happier

    exynos:
    - fix some regressions and build fixes

    amdgpu:
    - a couple of small fixes

    i915:
    - two DP MST fixes and a couple of other regression fixes

    Nothing too out of the ordinary or surprising at this point"

    * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
    drm/exynos: Use VIDEO_SAMSUNG_S5P_G2D=n as G2D Kconfig dependency
    drm/exynos: fix a warning message
    drm/exynos: mic: fix an error code
    drm/exynos: fimd: fix broken dp_clock control
    drm/exynos: build fbdev code conditionally
    drm/exynos: fix adjusted_mode pointer in exynos_plane_mode_set
    drm/exynos: fix error handling in exynos_drm_subdrv_open
    drm/amd/amdgpu: fix irq domain remove for tonga ih
    drm/i915: fix deadlock on lid open
    drm/radeon: use helper for mst connector dpms.
    drm/radeon/mst: port some MST setup code from DAL.
    drm/amdgpu: add invisible pin size statistic
    drm/edid: Fix DMT 1024x768@43Hz (interlaced) timings
    drm/i915: Exit cherryview_irq_handler() after one pass
    drm/i915: Call intel_dp_mst_resume() before resuming displays
    drm/i915: Fix race condition in intel_dp_destroy_mst_connector()
    drm/edid: Fix parsing of EDID 1.4 Established Timings III descriptor
    drm/edid: Fix EDID Established Timings I and II
    drm/qxl: fix cursor position with non-zero hotspot

    Linus Torvalds
     
  • Pull parisc ftrace fixes from Helge Deller:
    "This is (most likely) the last pull request for v4.6 for the parisc
    architecture.

    It fixes the FTRACE feature for parisc, which has been horribly broken
    for quite some time and doesn't even compile. This patch just fixes the
    bare minimum (it actually removes more lines than it adds), so that
    the function tracer works again on 32- and 64bit kernels.

    I've queued up additional patches on top of this patch which e.g. add
    the syscall tracer, but those have to wait for the merge window for
    v4.7."

    * 'parisc-4.6-4' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
    parisc: Fix ftrace function tracer

    Linus Torvalds
     
  • The ACPI specification does not specify the state of data after a clear
    poison operation. Potential future libnvdimm bus implementations for
    other architectures also might not specify or disagree on the state of
    data after clear poison. Clarify why we write twice.

    Reported-by: Jeff Moyer
    Reported-by: Vishal Verma
    Signed-off-by: Dan Williams
    Signed-off-by: Ross Zwisler
    Reviewed-by: Johannes Thumshirn
    Reviewed-by: Jeff Moyer
    Reviewed-by: Vishal Verma

    Dan Williams
     

15 Apr, 2016

14 commits

  • Starting from commit e36f620428 (block: split bios to max possible length),
    the block core splits bios in the middle of a bvec.

    Unfortunately loop dio/aio doesn't consider this situation, and
    always treats 'iter.iov_offset' as zero. Filesystem corruption
    is then observed.

    This patch figures out the offset of the base bvec via
    'bio->bi_iter.bi_bvec_done' and fixes the issue by passing the offset
    to the iov iterator.
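
    Why the offset matters can be sketched with a toy segment walker. The
    struct and function below are hypothetical and unrelated to the real
    iov_iter API; bvec_done plays the role of the bytes already consumed in
    the first segment. Passing 0 when the true value is non-zero would copy
    data that belongs to the other half of the split.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical segment descriptor: a buffer and its length. */
struct demo_bvec {
    const char *base;
    size_t len;
};

/*
 * Copy up to dst_len bytes from the segments into dst, skipping the first
 * bvec_done bytes of the first segment. Returns the number of bytes copied.
 */
static size_t demo_copy_from_bvecs(char *dst, size_t dst_len,
                                   const struct demo_bvec *v, size_t nr,
                                   size_t bvec_done)
{
    size_t copied = 0;

    for (size_t i = 0; i < nr && copied < dst_len; i++) {
        /* Only the first segment starts part-way in. */
        size_t off = (i == 0) ? bvec_done : 0;
        if (off > v[i].len)
            off = v[i].len;

        size_t n = v[i].len - off;
        if (n > dst_len - copied)
            n = dst_len - copied;

        memcpy(dst + copied, v[i].base + off, n);
        copied += n;
    }
    return copied;
}
```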

    Fixes: e36f6204288088f (block: split bios to max possible length)
    Cc: Keith Busch
    Cc: Al Viro
    Cc: stable@vger.kernel.org (4.5)
    Signed-off-by: Ming Lei
    Signed-off-by: Jens Axboe

    Ming Lei
     
  • Pull x86 fixes from Ingo Molnar:
    "Misc fixes: a binutils fix, an lguest fix, an mcelog fix and a missing
    documentation fix"

    * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86/mce: Avoid using object after free in genpool
    lguest, x86/entry/32: Fix handling of guest syscalls using interrupt gates
    x86/build: Build compressed x86 kernels as PIE
    x86/mm/pkeys: Add missing Documentation

    Linus Torvalds
     
  • Pull mm gup cleanup from Ingo Molnar:
    "This removes the ugly get-user-pages API hack, now that all upstream
    code has been migrated to it"

    ("ugly" is putting it mildly. But it worked.. - Linus)

    * 'mm-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    mm/gup: Remove the macro overload API migration helpers from the get_user*() APIs

    Linus Torvalds
     
  • Pull device mapper fixes from Mike Snitzer:

    - fix a 4.6-rc1 bio-based DM 'struct dm_target_io' leak in an error
    path

    - stable@ fix for DM cache metadata's READ_LOCK macros that were
    incorrectly returning error if the block manager was in read-only
    mode; also cleanup multi-statement macros to use do {} while(0)

    * tag 'dm-4.6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
    dm cache metadata: fix READ_LOCK macros and cleanup WRITE_LOCK macros
    dm: fix dm_target_io leak if clone_bio() returns an error

    Linus Torvalds
     
  • Pull pwm fix from Thierry Reding:
    "A single one-line fix to turn the regmap cache from an RB-tree to a
    flat cache to avoid lockdep and abort issues"

    * tag 'pwm/for-4.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm:
    pwm: fsl-ftm: Use flat regmap cache

    Linus Torvalds
     
  • Pull sound fixes from Takashi Iwai:
    "We've had a very calm development cycle, so far. Here are the few
    fixes for HD-audio and USB-audio, all of which are small and easy"

    * tag 'sound-4.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
    ALSA: hda - Fix inconsistent monitor_present state until repoll
    ALSA: hda - Fix regression of monitor_present flag in eld proc file
    ALSA: usb-audio: Skip volume controls triggers hangup on Dell USB Dock
    ALSA: hda/realtek - Enable the ALC292 dock fixup on the Thinkpad T460s
    ALSA: sscape: Use correct format identifier for size_t
    ALSA: usb-audio: Add a quirk for Plantronics BT300
    ALSA: usb-audio: Add a sample rate quirk for Phoenix Audio TMX320
    ALSA: hda - Bind with i915 only when Intel graphics is present

    Linus Torvalds
     
  • Pull mailbox fixes from Jussi Brar:
    "Misc fixes:

    mailbox-test driver:
    - prevent memory leak and another cosmetic change

    mailbox:
    - change the returned error code

    Xgene driver:
    - return -ENOMEM instead of PTR_ERR for failed devm_kzalloc"

    * 'mailbox-devel' of git://git.linaro.org/landing-teams/working/fujitsu/integration:
    mailbox: Stop using ENOSYS for anything other than unimplemented syscalls
    mailbox: mailbox-test: Prevent memory leak
    mailbox: mailbox-test: Use more consistent format for calling copy_from_user()
    mailbox: xgene-slimpro: Fix wrong test for devm_kzalloc

    Linus Torvalds
     
  • Pull f2fs/fscrypto fixes from Jaegeuk Kim:
    "In addition to f2fs/fscrypto fixes, I've added one patch which
    prevents RCU mode lookup in d_revalidate, as Al mentioned.

    These patches fix f2fs and fscrypto based on -rc3 bug fixes in ext4
    crypto, which have not yet been fully propagated as follows.

    - use of dget_parent and file_dentry to avoid crashes
    - disallow RCU-mode lookup in d_revalidate
    - disallow -ENOMEM in the core data encryption path"

    * tag 'for-linus-4.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs:
    ext4/fscrypto: avoid RCU lookup in d_revalidate
    fscrypto: don't let data integrity writebacks fail with ENOMEM
    f2fs: use dget_parent and file_dentry in f2fs_file_open
    fscrypto: use dget_parent() in fscrypt_d_revalidate()

    Linus Torvalds
     
  • Pull crypto fixes from Herbert Xu:
    "This fixes an NFS regression caused by the skcipher/hash conversion in
    sunrpc. It also fixes a build problem in certain configurations with
    bcm63xx"

    * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
    hwrng: bcm63xx - fix device tree compilation
    sunrpc: Fix skcipher/shash conversion

    Linus Torvalds
     
  • Pull keys bugfixes from James Morris:
    "Two bugfixes for Keys related code"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
    ASN.1: fix open failure check on headername
    assoc_array: don't call compare_object() on a node

    Linus Torvalds
     
  • The READ_LOCK macro was incorrectly returning -EINVAL if
    dm_bm_is_read_only() was true -- it will always be true once the cache
    metadata transitions to read-only by dm_cache_metadata_set_read_only().

    Wrap READ_LOCK and WRITE_LOCK multi-statement macros in do {} while(0).
    Also, all accesses of the 'cmd' argument passed to these related macros
    are now encapsulated in parentheses.
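
    The general pattern, sketched with hypothetical demo_ macros rather than
    the driver's own: wrapping the statements in do {} while (0) makes the
    macro behave as a single statement, so it composes with if/else, and
    parenthesizing (cmd) keeps arbitrary argument expressions safe.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical metadata handle for this sketch. */
struct demo_cmd {
    bool fail_io;
    int lock_depth;
};

/* Multi-statement macro wrapped so that it expands to one statement. */
#define DEMO_READ_LOCK(cmd)             \
    do {                                \
        if ((cmd)->fail_io)             \
            return -EINVAL;             \
        (cmd)->lock_depth++;            \
    } while (0)

#define DEMO_READ_UNLOCK(cmd)           \
    do {                                \
        (cmd)->lock_depth--;            \
    } while (0)

static int demo_lookup(struct demo_cmd *cmd, bool want_lock)
{
    if (want_lock)
        DEMO_READ_LOCK(cmd);
    else
        return 1;   /* this else pairs with the if above only thanks to the wrapper */

    DEMO_READ_UNLOCK(cmd);
    return 0;
}
```

    Without the wrapper, the if/else in demo_lookup() would either fail to
    compile or bind the else to the if hidden inside the macro.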

    A follow-up patch can be developed to eliminate the use of macros in
    favor of pure C code. Avoiding that now given that this needs to apply
    to stable@.

    Reported-by: Ben Hutchings
    Signed-off-by: Mike Snitzer
    Fixes: d14fcf3dd79 ("dm cache: make sure every metadata function checks fail_io")
    Cc: stable@vger.kernel.org

    Mike Snitzer
     
  • Multiple users have reported device initialization failure due to the driver
    not receiving legacy PCI interrupts. This is not unique to any particular
    controller, but has been observed on multiple platforms.

    There have been no issues reported or observed with message signaled
    interrupts, so this patch attempts to use MSI-x during initialization,
    falling back to MSI. If that fails, legacy would become the default.

    The setup_io_queues error handling had to change as a result: the admin
    queue's msix_entry used to be initialized to the legacy IRQ. The case
    where nr_io_queues is 0 would fail request_irq when setting up the admin
    queue's interrupt since re-enabling MSI-x fails with 0 vectors, leaving
    the admin queue's msix_entry invalid. Instead, return success immediately.
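
    The fallback order can be sketched as below. The device struct and the
    two enable helpers are hypothetical stand-ins for whatever the PCI layer
    provides; only the ordering (MSI-x, then MSI, then legacy) comes from
    the description above.

```c
enum demo_irq_mode { DEMO_IRQ_MSIX, DEMO_IRQ_MSI, DEMO_IRQ_LEGACY };

/* Hypothetical device model: flags say which modes the platform supports. */
struct demo_dev {
    int msix_ok;
    int msi_ok;
};

/* Hypothetical enable helpers: 0 on success, negative on failure. */
static int demo_enable_msix(const struct demo_dev *d) { return d->msix_ok ? 0 : -1; }
static int demo_enable_msi(const struct demo_dev *d)  { return d->msi_ok ? 0 : -1; }

/* Try the preferred mode first and fall back step by step. */
static enum demo_irq_mode demo_pick_irq_mode(const struct demo_dev *d)
{
    if (demo_enable_msix(d) == 0)
        return DEMO_IRQ_MSIX;
    if (demo_enable_msi(d) == 0)
        return DEMO_IRQ_MSI;
    return DEMO_IRQ_LEGACY;
}
```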

    Reported-by: Tim Muhlemmer
    Reported-by: Jon Derrick
    Signed-off-by: Keith Busch
    Signed-off-by: Jens Axboe

    Keith Busch
     
  • In commit c4004b02f8e5b ("x86: remove the kernel code/data/bss resources
    from /proc/iomem") I was hoping to remove the physical kernel address
    data from /proc/iomem entirely, but that had to be reverted because some
    system programs actually use it.

    This limits all the detailed resource information to properly
    credentialed users instead.

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     
  • The PCI config access checked the file capabilities correctly, but used
    the internal security capability check rather than the helper function
    that is actually meant for that.

    security_capable() has unusual return values and is not meant to be
    used elsewhere (the only other use is in the capability checking
    functions that we actually intend people to use), and this odd PCI usage
    really stood out when looking around the capability code.
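
    A sketch of why the two conventions are easy to mix up, assuming the
    low-level hook reports 0 for "granted" and a negative errno for
    "denied" while the helper returns a plain boolean. Both functions are
    hypothetical models, not the kernel's implementations.

```c
#include <errno.h>
#include <stdbool.h>

/* Model of the internal hook: 0 means granted, negative errno means denied. */
static int demo_security_capable(bool has_cap)
{
    return has_cap ? 0 : -EPERM;
}

/* Model of the helper callers should use: true means granted. */
static bool demo_capable(bool has_cap)
{
    return demo_security_capable(has_cap) == 0;
}
```

    Testing the hook's result as if it were a boolean would invert the
    meaning, which is one reason to prefer the helper.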

    Signed-off-by: Linus Torvalds

    Linus Torvalds