03 Oct, 2016

16 commits

  • Signed-off-by: Yan, Zheng

    Yan, Zheng
     
  • Accessing / causes failure if the client has caps that restrict path

    Signed-off-by: Yan, Zheng

    Yan, Zheng
     
  • Signed-off-by: Yan, Zheng

    Yan, Zheng
     
  • If O_DIRECT writes are racing with buffered writes, then
    the call to invalidate_inode_pages2_range() can call ceph_releasepage()
    on dirty pages.

    Most filesystems hold inode_lock() across O_DIRECT writes so they do not
    suffer this race, but cephfs deliberately drops the lock, and opens a window
    for the race.

    This race can be triggered with the generic/036 test from the xfstests
    test suite. It doesn't happen every time, but it does happen often.

    As the possibility is expected, remove the warning, and instead include
    the PageDirty() status in the debug message.

    Signed-off-by: NeilBrown
    Reviewed-by: Jeff Layton
    Reviewed-by: Yan, Zheng

    NeilBrown
     
  • This call can fail if there are dirty pages. The preceding call to
    filemap_write_and_wait_range() will normally remove dirty pages, but
    as inode_lock() is not held over calls to ceph_direct_read_write(), it
    could race with non-direct writes and pages could be dirtied
    immediately after filemap_write_and_wait_range() returns.

    If there are dirty pages, they will be removed by the subsequent call
    to truncate_inode_pages_range(), so having them here is not a problem.

    If the 'ret' value is left holding an error, then in the async IO case
    (aio_req is not NULL) the loop that would normally call
    ceph_osdc_start_request() will see the error in 'ret' and abort all
    requests. This doesn't seem like correct behaviour.

    So use separate 'ret2' instead of overloading 'ret'.

    Signed-off-by: NeilBrown
    Reviewed-by: Jeff Layton
    Reviewed-by: Yan, Zheng

    NeilBrown
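
The reasoning in this commit message can be illustrated with a small userspace sketch. It is not the cephfs code: `invalidate_cache()`, `submit_all()`, and the `fail_at` parameter are hypothetical stand-ins, assumed here only to show why a non-fatal error is kept in a separate `ret2` instead of being stored in the `ret` that the submission loop checks.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for a best-effort page cache invalidation: it
 * reports -EBUSY when a racing buffered write has left dirty pages. */
static int invalidate_cache(int dirty_pages)
{
	return dirty_pages ? -EBUSY : 0;
}

/*
 * Start n requests; request number fail_at (if in range) fails with -EIO.
 * Returns the number of requests started, or a negative errno.
 * The invalidation result goes into ret2 (and out via *inval_result for
 * logging), so a harmless -EBUSY never reaches ret and never aborts the
 * loop.
 */
static int submit_all(int n, int dirty_pages, int fail_at, int *inval_result)
{
	int ret = 0;
	int ret2;
	int started = 0;
	int i;

	assert(n >= 0);

	ret2 = invalidate_cache(dirty_pages);
	if (inval_result)
		*inval_result = ret2;

	for (i = 0; i < n; i++) {
		if (i == fail_at) {
			ret = -EIO;	/* a real submission error */
			break;
		}
		started++;
	}

	return ret < 0 ? ret : started;
}
```

With dirty pages present and no submission failure, `submit_all(3, 1, -1, &r)` still starts all three requests while `r` records the `-EBUSY` for diagnostics.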
     
  • If start_page() fails to add a page to the page cache or fails to send
    the OSD request, it should call put_page() (instead of free_page()) for
    the relevant pages.

    Besides, start_page() needs to cancel the fscache readpage if it fails
    to send the OSD request.

    Signed-off-by: Yan, Zheng
    Reported-by: Zhi Zhang

    Yan, Zheng
     
  • Pull the code that sets an error and marks a request done into a new
    helper. The obj_request_img_data_test() check isn't strictly needed right
    now, but it makes the helper applicable to !img_data requests and a bit
    safer.

    Signed-off-by: Ilya Dryomov

    Ilya Dryomov
     
  • Move the check into rbd_obj_request_destroy() to avoid use-after-free
    on errors in rbd_img_request_fill(..., OBJ_REQUEST_PAGES, ...), where
    pages, owned by the caller, get freed in rbd_img_request_fill().

    Signed-off-by: Ilya Dryomov
    Reviewed-by: Alex Elder
    Reviewed-by: David Disseldorp

    Ilya Dryomov
     
  • Accessing obj_request->img_request union field is only valid for object
    requests associated with an image (i.e. if obj_request_img_data_test()
    returns true). rbd_osd_req_format_read() used to do more, but now it
    just sets osd_req->snap_id. Standalone and stat object requests always
    go to the HEAD revision and are fine with CEPH_NOSNAP set by libceph,
    so get around the invalid union field use by simply not calling
    rbd_osd_req_format_read() in those places.

    Reported-by: David Disseldorp
    Signed-off-by: Ilya Dryomov
    Reviewed-by: Alex Elder
    Reviewed-by: David Disseldorp

    Ilya Dryomov
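
The rule described in this commit message (read a union member only when the flag says it is the active one) can be sketched in userspace C. The structure layout, `obj_req_snap_id()`, and `SKETCH_NOSNAP` are assumptions made for illustration; they do not reproduce the rbd data structures or the real CEPH_NOSNAP value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sentinel meaning "no snapshot, read HEAD"; the real
 * CEPH_NOSNAP value is not reproduced here. */
#define SKETCH_NOSNAP UINT64_MAX

struct img_req {
	uint64_t snap_id;
};

struct obj_req {
	bool img_data;	/* models what obj_request_img_data_test() reports */
	union {
		struct img_req *img_request;	/* valid only if img_data */
		struct obj_req *stat_target;	/* valid only if !img_data */
	} u;
};

/* Pick the snapshot id without touching the inactive union member. */
static uint64_t obj_req_snap_id(const struct obj_req *obj)
{
	assert(obj != NULL);
	if (!obj->img_data)
		return SKETCH_NOSNAP;	/* standalone/stat: always HEAD */
	return obj->u.img_request->snap_id;
}
```

A request that is not tied to an image never dereferences `u.img_request`, which is the invalid access the commit avoids.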
     
  • - don't put obj_request before rbd_obj_request_get() if
    rbd_obj_request_create() fails
    - don't leak pages if rbd_obj_request_create() fails
    - don't leak stat_request if rbd_osd_req_create() fails

    Reported-by: David Disseldorp
    Signed-off-by: Ilya Dryomov
    Reviewed-by: Alex Elder
    Reviewed-by: David Disseldorp

    Ilya Dryomov
     
  • - fix parent_length == img_request->xferred assert to not fire on
    copyup read failures
    - don't leak pages if copyup read fails or we can't allocate a new osd
    request

    Signed-off-by: Ilya Dryomov
    Reviewed-by: Alex Elder
    Reviewed-by: David Disseldorp

    Ilya Dryomov
     
  • Commit 0f2d5be792b0 ("rbd: use reference counts for image requests")
    added rbd_img_request_get(), which rbd_img_request_fill() calls for
    each obj_request added to img_request. It was an urgent band-aid for
    the ugliness that is rbd_img_obj_callback() and none of the error paths
    were updated.

    Given that this img_request reference is meant to represent an
    obj_request that hasn't passed through rbd_img_obj_callback() yet,
    proper cleanup in appropriate destructors is a challenge. However,
    noting that if we don't get a chance to call rbd_obj_request_complete(),
    there is not going to be a call to rbd_img_obj_callback(), we can move
    rbd_img_request_get() into rbd_obj_request_submit() and fix up the two
    places that call rbd_obj_request_complete() directly and not through
    rbd_obj_request_submit() to temporarily bump img_request, so that
    rbd_img_obj_callback() can put as usual.

    This takes care of img_request leaks on errors on the submit side.

    Signed-off-by: Ilya Dryomov
    Reviewed-by: Alex Elder

    Ilya Dryomov
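
A minimal userspace sketch of the reference-counting idea in this commit message follows. All names (`img_get()`, `obj_submit()`, `fill()`, `obj_complete_directly()`) are hypothetical simplifications, not the rbd functions: the point is only that taking the reference at submit time leaves the setup error path with nothing to undo, while a direct completion bumps the count so the callback can put as usual.

```c
#include <assert.h>
#include <errno.h>

struct img_req {
	int refs;
};

struct obj_req {
	struct img_req *img;
	int done;
};

static void img_get(struct img_req *img)
{
	img->refs++;
}

static void img_put(struct img_req *img)
{
	assert(img->refs > 0);
	img->refs--;
}

/* Completion callback: always drops exactly one image reference. */
static void obj_callback(struct obj_req *obj)
{
	obj->done = 1;
	img_put(obj->img);
}

/* Setup step: may fail, and takes no image reference, so its error
 * path has nothing to undo. Request number fail_at fails with -ENOMEM. */
static int fill(struct img_req *img, struct obj_req *objs, int n, int fail_at)
{
	int i;

	for (i = 0; i < n; i++) {
		if (i == fail_at)
			return -ENOMEM;
		objs[i].img = img;
		objs[i].done = 0;
	}
	return 0;
}

/* The reference is taken only once the request is really submitted. */
static void obj_submit(struct obj_req *obj)
{
	img_get(obj->img);
	/* ... hand the request to the transport; obj_callback() runs later */
}

/* Direct completion bypasses obj_submit(), so bump the count first and
 * let obj_callback() put as usual. */
static void obj_complete_directly(struct obj_req *obj)
{
	img_get(obj->img);
	obj_callback(obj);
}
```

In this model a failed `fill()` leaves the image count untouched, and every `obj_submit()` or `obj_complete_directly()` is balanced by exactly one put in `obj_callback()`.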
     
  • If the stat request fails with something other than -ENOENT (which just
    means that we need to copyup), the original object request is never
    marked as done and therefore never completed. Fix this by moving the
    mark done + complete snippet from rbd_img_obj_parent_read_full() into
    rbd_img_obj_exists_callback(). The former remains covered, as the
    latter is its only caller (through rbd_img_obj_request_submit()).

    Signed-off-by: Ilya Dryomov
    Reviewed-by: Alex Elder
    Reviewed-by: David Disseldorp

    Ilya Dryomov
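
The three outcomes of the stat request described above can be sketched as a small decision function. This is a simplified userspace model under stated assumptions: `exists_callback()`, `fail_request()`, and `enum next_step` are hypothetical names, and the real callback does more (for example, it also has to handle a failure when submitting the original request).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

enum next_step {
	STEP_SUBMIT_ORIGINAL,	/* object exists: issue the original request */
	STEP_COPYUP,		/* -ENOENT: object missing, copy up from parent */
	STEP_COMPLETED_ERROR	/* any other error: fail the original request */
};

struct obj_req {
	int result;
	bool done;
	bool completed;
};

/* Hypothetical helper: record the error, mark done, and complete. */
static void fail_request(struct obj_req *orig, int err)
{
	assert(err < 0);
	orig->result = err;
	orig->done = true;
	orig->completed = true;
}

/* Called with the result of the stat ("does the object exist?") request. */
static enum next_step exists_callback(struct obj_req *orig, int stat_result)
{
	if (stat_result == 0)
		return STEP_SUBMIT_ORIGINAL;
	if (stat_result == -ENOENT)
		return STEP_COPYUP;

	/* Without this branch the original request would never finish. */
	fail_request(orig, stat_result);
	return STEP_COMPLETED_ERROR;
}
```

Only the last branch marks the original request done and completes it; the first two leave it pending for the follow-up work.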
     
  • Assert once in rbd_img_obj_request_submit().

    Signed-off-by: Ilya Dryomov
    Reviewed-by: Alex Elder
    Reviewed-by: David Disseldorp

    Ilya Dryomov
     
  • - osdc parameter is useless
    - starting with commit 5aea3dcd5021 ("libceph: a major OSD client
    update"), ceph_osdc_start_request() always returns success

    Signed-off-by: Ilya Dryomov
    Reviewed-by: Alex Elder
    Reviewed-by: David Disseldorp

    Ilya Dryomov
     
  • Add a per-device option to acquire exclusive lock on reads (in addition
    to writes and discards). The use case is iSCSI, where it will be used
    to prevent execution of stale writes after the implicit failover.

    Signed-off-by: Ilya Dryomov
    Tested-by: Mike Christie

    Ilya Dryomov
     

25 Aug, 2016

16 commits


15 Aug, 2016

3 commits

  • Linus Torvalds
     
  • Pull thermal updates from Zhang Rui:

    - Fix a race condition when updating cooling device, which may lead to
    a situation where a thermal governor never updates the cooling
    device. From Michele Di Giorgio.

    - Fix a zero division error when disabling the forced idle injection
    from the intel powerclamp. From Petr Mladek.

    - Add suspend/resume callback for intel_pch_thermal thermal driver.
    From Srinivas Pandruvada.

    - Another two fixes for the clock cooling driver and the hwmon sysfs I/F.
    From Michele Di Giorgio and Kuninori Morimoto.

    [ Hmm. That suspend/resume callback for intel_pch_thermal doesn't look
    like a fix, but I'm letting it slide.. - Linus ]

    * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux:
    thermal: clock_cooling: Fix missing mutex_init()
    thermal: hwmon: EXPORT_SYMBOL_GPL for thermal hwmon sysfs
    thermal: fix race condition when updating cooling device
    thermal/powerclamp: Prevent division by zero when counting interval
    thermal: intel_pch_thermal: Add suspend/resume callback

    Linus Torvalds
     
  • Pull m68knommu fix from Greg Ungerer:
    "This contains only a single fix for a register corruption problem on
    certain types of m68k flat format binaries"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu:
    m68knommu: fix user a5 register being overwritten

    Linus Torvalds
     

14 Aug, 2016

4 commits

  • …/groeck/linux-staging

    Pull h8300 and unicore32 architecture fixes from Guenter Roeck:
    "Two patches to fix h8300 and unicore32 builds.

    unicore32 builds have been broken since v4.6. The fix has been
    available in -next since March of this year.

    h8300 builds have been broken since the last commit window. The fix
    has been available in -next since June of this year"

    * tag 'fixes-for-linus-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
    h8300: Add missing include file to asm/io.h
    unicore32: mm: Add missing parameter to arch_vma_access_permitted

    Linus Torvalds
     
  • Pull arm64 fixes from Catalin Marinas:

    - support for nr_cpus= command line argument (maxcpus was previously
    changed to allow secondary CPUs to be hot-plugged)

    - ARM PMU interrupt handling fix

    - fix potential TLB conflict in the hibernate code

    - improved handling of EL1 instruction aborts (better error reporting)

    - removal of useless jprobes code for stack saving/restoring

    - defconfig updates

    * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
    arm64: defconfig: enable CONFIG_LOCALVERSION_AUTO
    arm64: defconfig: add options for virtualization and containers
    arm64: hibernate: handle allocation failures
    arm64: hibernate: avoid potential TLB conflict
    arm64: Handle el1 synchronous instruction aborts cleanly
    arm64: Remove stack duplicating code from jprobes
    drivers/perf: arm-pmu: Fix handling of SPI lacking "interrupt-affinity" property
    drivers/perf: arm-pmu: convert arm_pmu_mutex to spinlock
    arm64: Support hard limit of cpu count by nr_cpus

    Linus Torvalds
     
  • Pull KVM fixes from Radim Krčmář:
    "KVM:
    - lock kvm_device list to prevent corruption on device creation.

    PPC:
    - split debugfs initialization from creation of the xics device to
    unlock the newly taken kvm lock earlier.

    s390:
    - prevent userspace from triggering two WARN_ON_ONCE.

    MIPS:
    - fix several issues in the management of TLB faults (Cc: stable)"

    * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
    MIPS: KVM: Propagate kseg0/mapped tlb fault errors
    MIPS: KVM: Fix gfn range check in kseg0 tlb faults
    MIPS: KVM: Add missing gfn range check
    MIPS: KVM: Fix mapped fault broken commpage handling
    KVM: Protect device ops->create and list_add with kvm->lock
    KVM: PPC: Move xics_debugfs_init out of create
    KVM: s390: reset KVM_REQ_MMU_RELOAD if mapping the prefix failed
    KVM: s390: set the prefix initially properly

    Linus Torvalds
     
  • Pull block fixes from Jens Axboe:

    - an NVMe fix from Gabriel, fixing a suspend/resume issue on some
    setups

    - addition of a few missing entries in the block queue sysfs
    documentation, from Joe

    - a fix for a sparse shadow warning for the bvec iterator, from
    Johannes

    - a writeback deadlock involving raid issuing barriers, and not
    flushing the plug when we wakeup the flusher threads. From
    Konstantin

    - a set of patches for the NVMe target/loop/rdma code, from Roland and
    Sagi

    * 'for-linus' of git://git.kernel.dk/linux-block:
    bvec: avoid variable shadowing warning
    doc: update block/queue-sysfs.txt entries
    nvme: Suspend all queues before deletion
    mm, writeback: flush plugged IO in wakeup_flusher_threads()
    nvme-rdma: Remove unused includes
    nvme-rdma: start async event handler after reconnecting to a controller
    nvmet: Fix controller serial number inconsistency
    nvmet-rdma: Don't use the inline buffer in order to avoid allocation for small reads
    nvmet-rdma: Correctly handle RDMA device hot removal
    nvme-rdma: Make sure to shutdown the controller if we can
    nvme-loop: Remove duplicate call to nvme_remove_namespaces
    nvme-rdma: Free the I/O tags when we delete the controller
    nvme-rdma: Remove duplicate call to nvme_remove_namespaces
    nvme-rdma: Fix device removal handling
    nvme-rdma: Queue ns scanning after a sucessful reconnection
    nvme-rdma: Don't leak uninitialized memory in connect request private data

    Linus Torvalds
     

13 Aug, 2016

1 commit

  • h8300 builds fail with

    arch/h8300/include/asm/io.h:9:15: error: unknown type name ‘u8’
    arch/h8300/include/asm/io.h:15:15: error: unknown type name ‘u16’
    arch/h8300/include/asm/io.h:21:15: error: unknown type name ‘u32’

    and many related errors.

    Fixes: 23c82d41bdf4 ("kexec-allow-architectures-to-override-boot-mapping-fix")
    Cc: Andrew Morton
    Signed-off-by: Guenter Roeck

    Guenter Roeck
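
The "unknown type name" errors above are what a compiler reports when a header uses fixed-width types before any header that defines them has been included. The sketch below is a userspace analogy, not the h8300 patch: it uses `<stdint.h>` types and hypothetical `sketch_read*()` accessors to show a header-style snippet that pulls in its own type definitions.

```c
#include <assert.h>
#include <stdint.h>	/* without this include, uint8_t etc. are unknown type names */

/* Hypothetical userspace analogues of byte/word/long accessors. */
static inline uint8_t sketch_readb(const volatile void *addr)
{
	assert(addr != 0);
	return *(const volatile uint8_t *)addr;
}

static inline uint16_t sketch_readw(const volatile void *addr)
{
	assert(addr != 0);
	return *(const volatile uint16_t *)addr;
}

static inline uint32_t sketch_readl(const volatile void *addr)
{
	assert(addr != 0);
	return *(const volatile uint32_t *)addr;
}
```

Because the snippet includes the header that defines its types, it compiles regardless of what the including file happened to include first.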