11 Nov, 2015

2 commits

  • Pull block IO poll support from Jens Axboe:
    "Various groups have been doing experimentation around IO polling for
    (really) fast devices. The code has been reviewed and has been
    sitting on the side for a few releases, but this is now good enough
    for coordinated benchmarking and further experimentation.

    Currently O_DIRECT sync read/write are supported. A framework is in
    the works that allows scalable stats tracking so we can auto-tune
    this. And we'll add libaio support as well soon. Fow now, it's an
    opt-in feature for test purposes"

    * 'for-4.4/io-poll' of git://git.kernel.dk/linux-block:
    direct-io: be sure to assign dio->bio_bdev for both paths
    directio: add block polling support
    NVMe: add blk polling support
    block: add block polling support
    blk-mq: return tag/queue combo in the make_request_fn handlers
    block: change ->make_request_fn() and users to return a queue cookie

    Linus Torvalds
     
  • Pull config fix for md from Neil Brown:
    "New config dependency needed as md/raid5 now uses crc32c"

    * tag 'md/4.4-rc0-fix' of git://neil.brown.name/md:
    raid5-cache: add crc32c Kconfig dependency

    Linus Torvalds
     

09 Nov, 2015

1 commit

  • The recent change of the raid5-cache code to use crc32c instead
    of crc32 causes link errors when CONFIG_LIBCRC32C is disabled:

    drivers/built-in.o: In function crc32c'
    core.c:(.text+0x1c6060): undefined reference to `crc32c'

    This adds an explicit 'select' statement like all other users
    of this function do.

    Signed-off-by: Arnd Bergmann
    Fixes: 5cb2fbd6ea0d ("raid5-cache: use crc32c checksum")
    Signed-off-by: NeilBrown

    Arnd Bergmann
     

08 Nov, 2015

3 commits

  • Merge second patch-bomb from Andrew Morton:

    - most of the rest of MM

    - procfs

    - lib/ updates

    - printk updates

    - bitops infrastructure tweaks

    - checkpatch updates

    - nilfs2 update

    - signals

    - various other misc bits: coredump, seqfile, kexec, pidns, zlib, ipc,
    dma-debug, dma-mapping, ...

    * emailed patches from Andrew Morton : (102 commits)
    ipc,msg: drop dst nil validation in copy_msg
    include/linux/zutil.h: fix usage example of zlib_adler32()
    panic: release stale console lock to always get the logbuf printed out
    dma-debug: check nents in dma_sync_sg*
    dma-mapping: tidy up dma_parms default handling
    pidns: fix set/getpriority and ioprio_set/get in PRIO_USER mode
    kexec: use file name as the output message prefix
    fs, seqfile: always allow oom killer
    seq_file: reuse string_escape_str()
    fs/seq_file: use seq_* helpers in seq_hex_dump()
    coredump: change zap_threads() and zap_process() to use for_each_thread()
    coredump: ensure all coredumping tasks have SIGNAL_GROUP_COREDUMP
    signal: remove jffs2_garbage_collect_thread()->allow_signal(SIGCONT)
    signal: introduce kernel_signal_stop() to fix jffs2_garbage_collect_thread()
    signal: turn dequeue_signal_lock() into kernel_dequeue_signal()
    signals: kill block_all_signals() and unblock_all_signals()
    nilfs2: fix gcc uninitialized-variable warnings in powerpc build
    nilfs2: fix gcc unused-but-set-variable warnings
    MAINTAINERS: nilfs2: add header file for tracing
    nilfs2: add tracepoints for analyzing reading and writing metadata files
    ...

    Linus Torvalds
     
  • Pull trivial updates from Jiri Kosina:
    "Trivial stuff from trivial tree that can be trivially summed up as:

    - treewide drop of spurious unlikely() before IS_ERR() from Viresh
    Kumar

    - cosmetic fixes (that don't really affect basic functionality of the
    driver) for pktcdvd and bcache, from Julia Lawall and Petr Mladek

    - various comment / printk fixes and updates all over the place"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial:
    bcache: Really show state of work pending bit
    hwmon: applesmc: fix comment typos
    Kconfig: remove comment about scsi_wait_scan module
    class_find_device: fix reference to argument "match"
    debugfs: document that debugfs_remove*() accepts NULL and error values
    net: Drop unlikely before IS_ERR(_OR_NULL)
    mm: Drop unlikely before IS_ERR(_OR_NULL)
    fs: Drop unlikely before IS_ERR(_OR_NULL)
    drivers: net: Drop unlikely before IS_ERR(_OR_NULL)
    drivers: misc: Drop unlikely before IS_ERR(_OR_NULL)
    UBI: Update comments to reflect UBI_METAONLY flag
    pktcdvd: drop null test before destroy functions

    Linus Torvalds
     
  • No functional changes in this patch, but it prepares us for returning
    a more useful cookie related to the IO that was queued up.

    Signed-off-by: Jens Axboe
    Acked-by: Christoph Hellwig
    Acked-by: Keith Busch

    Jens Axboe
     

07 Nov, 2015

1 commit

  • …d avoiding waking kswapd

    __GFP_WAIT has been used to identify atomic context in callers that hold
    spinlocks or are in interrupts. They are expected to be high priority and
    have access one of two watermarks lower than "min" which can be referred
    to as the "atomic reserve". __GFP_HIGH users get access to the first
    lower watermark and can be called the "high priority reserve".

    Over time, callers had a requirement to not block when fallback options
    were available. Some have abused __GFP_WAIT leading to a situation where
    an optimisitic allocation with a fallback option can access atomic
    reserves.

    This patch uses __GFP_ATOMIC to identify callers that are truely atomic,
    cannot sleep and have no alternative. High priority users continue to use
    __GFP_HIGH. __GFP_DIRECT_RECLAIM identifies callers that can sleep and
    are willing to enter direct reclaim. __GFP_KSWAPD_RECLAIM to identify
    callers that want to wake kswapd for background reclaim. __GFP_WAIT is
    redefined as a caller that is willing to enter direct reclaim and wake
    kswapd for background reclaim.

    This patch then converts a number of sites

    o __GFP_ATOMIC is used by callers that are high priority and have memory
    pools for those requests. GFP_ATOMIC uses this flag.

    o Callers that have a limited mempool to guarantee forward progress clear
    __GFP_DIRECT_RECLAIM but keep __GFP_KSWAPD_RECLAIM. bio allocations fall
    into this category where kswapd will still be woken but atomic reserves
    are not used as there is a one-entry mempool to guarantee progress.

    o Callers that are checking if they are non-blocking should use the
    helper gfpflags_allow_blocking() where possible. This is because
    checking for __GFP_WAIT as was done historically now can trigger false
    positives. Some exceptions like dm-crypt.c exist where the code intent
    is clearer if __GFP_DIRECT_RECLAIM is used instead of the helper due to
    flag manipulations.

    o Callers that built their own GFP flags instead of starting with GFP_KERNEL
    and friends now also need to specify __GFP_KSWAPD_RECLAIM.

    The first key hazard to watch out for is callers that removed __GFP_WAIT
    and was depending on access to atomic reserves for inconspicuous reasons.
    In some cases it may be appropriate for them to use __GFP_HIGH.

    The second key hazard is callers that assembled their own combination of
    GFP flags instead of starting with something like GFP_KERNEL. They may
    now wish to specify __GFP_KSWAPD_RECLAIM. It's almost certainly harmless
    if it's missed in most cases as other activity will wake kswapd.

    Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
    Acked-by: Vlastimil Babka <vbabka@suse.cz>
    Acked-by: Michal Hocko <mhocko@suse.com>
    Acked-by: Johannes Weiner <hannes@cmpxchg.org>
    Cc: Christoph Lameter <cl@linux.com>
    Cc: David Rientjes <rientjes@google.com>
    Cc: Vitaly Wool <vitalywool@gmail.com>
    Cc: Rik van Riel <riel@redhat.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

    Mel Gorman
     

06 Nov, 2015

1 commit

  • WORK_STRUCT_PENDING is a mask for testing the pending bit.
    test_bit() expects the number of the bit and we need to
    use WORK_STRUCT_PENDING_BIT there.

    Also work_data_bits() is defined in workqueues.h now.

    I have noticed this just by chance when looking how
    WORK_STRUCT_PENDING_BIT is used. The change is compile
    tested.

    Signed-off-by: Petr Mladek
    Signed-off-by: Jiri Kosina

    Petr Mladek
     

05 Nov, 2015

3 commits

  • Pull device mapper updates from Mike Snitzer:
    "Smaller set of DM changes for this merge. I've based these changes on
    Jens' for-4.4/reservations branch because the associated DM changes
    required it.

    - Revert a dm-multipath change that caused a regression for
    unprivledged users (e.g. kvm guests) that issued ioctls when a
    multipath device had no available paths.

    - Include Christoph's refactoring of DM's ioctl handling and add
    support for passing through persistent reservations with DM
    multipath.

    - All other changes are very simple cleanups"

    * tag 'dm-4.4-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
    dm switch: simplify conditional in alloc_region_table()
    dm delay: document that offsets are specified in sectors
    dm delay: capitalize the start of an delay_ctr() error message
    dm delay: Use DM_MAPIO macros instead of open-coded equivalents
    dm linear: remove redundant target name from error messages
    dm persistent data: eliminate unnecessary return values
    dm: eliminate unused "bioset" process for each bio-based DM device
    dm: convert ffs to __ffs
    dm: drop NULL test before kmem_cache_destroy() and mempool_destroy()
    dm: add support for passing through persistent reservations
    dm: refactor ioctl handling
    Revert "dm mpath: fix stalls when handling invalid ioctls"
    dm: initialize non-blk-mq queue data before queue is used

    Linus Torvalds
     
  • Pull md updates from Neil Brown:
    "Two major components to this update.

    1) The clustered-raid1 support from SUSE is nearly complete. There
    are a few outstanding issues being worked on. Maybe half a dozen
    patches will bring this to a usable state.

    2) The first stage of journalled-raid5 support from Facebook makes an
    appearance. With a journal device configured (typically NVRAM or
    SSD), the "RAID5 write hole" should be closed - a crash during
    degraded operations cannot result in data corruption.

    The next stage will be to use the journal as a write-behind cache
    so that latency can be reduced and in some cases throughput
    increased by performing more full-stripe writes.

    * tag 'md/4.4' of git://neil.brown.name/md: (66 commits)
    MD: when RAID journal is missing/faulty, block RESTART_ARRAY_RW
    MD: set journal disk ->raid_disk
    MD: kick out journal disk if it's not fresh
    raid5-cache: start raid5 readonly if journal is missing
    MD: add new bit to indicate raid array with journal
    raid5-cache: IO error handling
    raid5: journal disk can't be removed
    raid5-cache: add trim support for log
    MD: fix info output for journal disk
    raid5-cache: use bio chaining
    raid5-cache: small log->seq cleanup
    raid5-cache: new helper: r5_reserve_log_entry
    raid5-cache: inline r5l_alloc_io_unit into r5l_new_meta
    raid5-cache: take rdev->data_offset into account early on
    raid5-cache: refactor bio allocation
    raid5-cache: clean up r5l_get_meta
    raid5-cache: simplify state machine when caches flushes are not needed
    raid5-cache: factor out a helper to run all stripes for an I/O unit
    raid5-cache: rename flushed_ios to finished_ios
    raid5-cache: free I/O units earlier
    ...

    Linus Torvalds
     
  • Pull block integrity updates from Jens Axboe:
    ""This is the joint work of Dan and Martin, cleaning up and improving
    the support for block data integrity"

    * 'for-4.4/integrity' of git://git.kernel.dk/linux-block:
    block, libnvdimm, nvme: provide a built-in blk_integrity nop profile
    block: blk_flush_integrity() for bio-based drivers
    block: move blk_integrity to request_queue
    block: generic request_queue reference counting
    nvme: suspend i/o during runtime blk_integrity_unregister
    md: suspend i/o during runtime blk_integrity_unregister
    md, dm, scsi, nvme, libnvdimm: drop blk_integrity_unregister() at shutdown
    block: Inline blk_integrity in struct gendisk
    block: Export integrity data interval size in sysfs
    block: Reduce the size of struct blk_integrity
    block: Consolidate static integrity profile properties
    block: Move integrity kobject to struct gendisk

    Linus Torvalds
     

01 Nov, 2015

29 commits