19 Aug, 2014

4 commits

  • Most places which allocate an r10_bio zero ->state, but some don't.
    As the r10_bio comes from a mempool, and the allocation function uses
    kzalloc, it is often zero anyway. But sometimes it isn't, and it is
    best to be safe.

    I only noticed this because of the bug fixed by an earlier patch
    where the r10_bios allocated for a reshape were left around to
    be used by a subsequent resync. In that case the R10BIO_IsReshape
    flag caused problems.

    Signed-off-by: NeilBrown

    NeilBrown
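The defensive pattern can be sketched in plain userspace C. The struct and function names below are invented for illustration; the point is only that an object handed out from a pool resets its per-use state itself rather than trusting the allocator:

```c
#include <assert.h>
#include <stdlib.h>

/* Illustrative stand-in for struct r10bio: just the per-use state word. */
struct r10bio_demo {
    unsigned long state;   /* flag bits such as R10BIO_IsReshape */
    int sectors;
};

/* Pool objects may be recycled with stale flags from a previous user,
 * so the allocation path clears ->state explicitly instead of relying
 * on the underlying allocator having zeroed the memory. */
struct r10bio_demo *get_from_pool(struct r10bio_demo *recycled)
{
    struct r10bio_demo *r10_bio =
        recycled ? recycled : calloc(1, sizeof(*r10_bio));
    if (!r10_bio)
        return NULL;
    r10_bio->state = 0;   /* the fix: always zero, recycled or fresh */
    return r10_bio;
}
```

Recycling an object with a stale flag set then yields one whose state is zero again, which is the property the patch restores.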
     
  • If raid10 reshape fails to find somewhere to read a block
    from, it returns without freeing memory...

    Signed-off-by: NeilBrown

    NeilBrown
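The shape of this kind of fix is the classic goto-cleanup pattern, sketched below with invented names (nothing here is raid10.c's actual code):

```c
#include <assert.h>
#include <stdlib.h>

int leaks;   /* counts allocations not yet freed, for demonstration */

static int find_source_device(int have_source) { return have_source; }

/* Every early exit taken after the allocation must pass through the
 * cleanup label; the bug was a bare return on the failure path. */
int reshape_read_step(int have_source)
{
    int ret = -1;
    char *buf = malloc(4096);
    if (!buf)
        return -1;
    leaks++;

    if (!find_source_device(have_source))
        goto out_free;   /* abort, but still release buf below */

    /* ... read the block into buf ... */
    ret = 0;

out_free:
    free(buf);
    leaks--;
    return ret;
}
```

Both the success and the failure path leave the leak counter at zero.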
     
  • When a raid10 commences a resync/recovery/reshape it allocates
    some buffer space.
    When a resync/recovery completes the buffer space is freed. But not
    when the reshape completes.
    This can result in a small memory leak.

    There is a subtle side-effect of this bug. When a RAID10 is reshaped
    to a larger array (more devices), the reshape is immediately followed
    by a "resync" of the new space. This "resync" will use the buffer
    space which was allocated for "reshape". This can cause problems
    including a "BUG" in the SCSI layer. So this is suitable for -stable.

    Cc: stable@vger.kernel.org (v3.5+)
    Fixes: 3ea7daa5d7fde47cd41f4d56c2deb949114da9d6
    Signed-off-by: NeilBrown

    NeilBrown
     
  • raid10 reshape clears unwanted bits from a bio->bi_flags using
    a method which, while clumsy, worked until 3.10 when BIO_OWNS_VEC
    was added.
    Since then it clears that bit but shouldn't. This results in a
    memory leak.

    So change to use the approved method of clearing unwanted bits.

    As this causes a memory leak which can consume all of memory
    the fix is suitable for -stable.

    Fixes: a38352e0ac02dbbd4fa464dc22d1352b5fbd06fd
    Cc: stable@vger.kernel.org (v3.10+)
    Reported-by: mdraid.pkoch@dfgh.net (Peter Koch)
    Signed-off-by: NeilBrown

    NeilBrown
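The difference between the clumsy and the approved method can be shown with a toy flag word. The flag names and bit positions below are invented; the real ones live in the block layer:

```c
#include <assert.h>

/* Illustrative flag bits (positions invented for the example). */
#define BIO_UPTODATE_DEMO  (1u << 0)
#define BIO_SEG_VALID_DEMO (1u << 1)
#define BIO_OWNS_VEC_DEMO  (1u << 2)   /* must survive the reset */

/* Clumsy method: keep only the bits you remembered to list.  Any flag
 * added later (like BIO_OWNS_VEC) is silently cleared too. */
unsigned clear_flags_clumsy(unsigned flags)
{
    return flags & (BIO_UPTODATE_DEMO | BIO_SEG_VALID_DEMO);
}

/* Approved method: clear exactly the unwanted bits and leave the rest
 * alone, so newly introduced flags are preserved by default. */
unsigned clear_flags_masked(unsigned flags)
{
    return flags & ~BIO_SEG_VALID_DEMO;   /* drop only what we must */
}
```

Losing an "owns the vec" style bit is precisely how a memory leak appears: the owner flag is gone, so nobody frees the memory.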
     

31 Jul, 2014

1 commit

  • Currently we don't abort recovery on a write error if the write error
    to the recovering device was triggered by normal IO (as opposed to
    recovery IO).

    This means that for one bitmap region, the recovery might write to the
    recovering device for a few sectors, then not bother for subsequent
    sectors (as it never writes to failed devices). In this case
    the bitmap bit will be cleared, but it really shouldn't.

    The result is that if the recovering device fails and is then re-added
    (after fixing whatever hardware problem triggered the failure),
    the second recovery won't redo the region it was in the middle of,
    so some of the device will not be recovered properly.

    If we abort the recovery, the region being processed will be cancelled
    (bit not cleared) and the whole region will be retried.

    As the bug can result in data corruption the patch is suitable for
    -stable. For kernels prior to 3.11 there is a conflict in raid10.c
    which will require care.

    Original-from: jiao hui
    Reported-and-tested-by: jiao hui
    Signed-off-by: NeilBrown
    Cc: stable@vger.kernel.org

    NeilBrown
     

06 May, 2014

1 commit

  • wait_barrier() includes a counter, so we must call it precisely once
    (unless balanced by allow_barrier()) for each request submitted.

    Since
    commit 20d0189b1012a37d2533a87fb451f7852f2418d1
    block: Introduce new bio_split()
    in 3.14-rc1, we don't call it for the extra requests generated when
    we need to split a bio.

    When this happens the counter goes negative, any resync/recovery will
    never start, and "mdadm --stop" will hang.

    Reported-by: Chris Murphy
    Fixes: 20d0189b1012a37d2533a87fb451f7852f2418d1
    Cc: stable@vger.kernel.org (3.14+)
    Cc: Kent Overstreet
    Signed-off-by: NeilBrown

    NeilBrown
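A toy model of the counter pairing described above, with invented structures (the real wait_barrier() also sleeps while a barrier is raised, which is omitted here):

```c
#include <assert.h>

/* nr_pending must be incremented exactly once per submitted request and
 * decremented exactly once at completion; the bug left the extra
 * requests created by a bio split uncounted, driving it negative. */
struct conf_demo { int nr_pending; };

void wait_barrier(struct conf_demo *conf)  { conf->nr_pending++; }
void allow_barrier(struct conf_demo *conf) { conf->nr_pending--; }

/* The fixed code calls wait_barrier() once per request that results
 * from splitting a bio. */
void submit_split(struct conf_demo *conf, int pieces)
{
    for (int i = 0; i < pieces; i++)
        wait_barrier(conf);
}

void complete_split(struct conf_demo *conf, int pieces)
{
    for (int i = 0; i < pieces; i++)
        allow_barrier(conf);
}
```

With the calls balanced the counter returns to zero after completion, which is the condition resync and "mdadm --stop" wait for.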
     

31 Jan, 2014

1 commit

  • Pull core block IO changes from Jens Axboe:
    "The major piece in here is the immutable bio_vec series from Kent, the
    rest is fairly minor. It was supposed to go in last round, but
    various issues pushed it to this release instead. The pull request
    contains:

    - Various smaller blk-mq fixes from different folks. Nothing major
    here, just minor fixes and cleanups.

    - Fix for a memory leak in the error path in the block ioctl code
    from Christian Engelmayer.

    - Header export fix from CaiZhiyong.

    - Finally the immutable biovec changes from Kent Overstreet. This
    enables some nice future work on making arbitrarily sized bios
    possible, and splitting more efficient. Related fixes to immutable
    bio_vecs:

    - dm-cache immutable fixup from Mike Snitzer.
    - btrfs immutable fixup from Muthu Kumar.

    - bio-integrity fix from Nic Bellinger, which is also going to stable"

    * 'for-3.14/core' of git://git.kernel.dk/linux-block: (44 commits)
    xtensa: fixup simdisk driver to work with immutable bio_vecs
    block/blk-mq-cpu.c: use hotcpu_notifier()
    blk-mq: for_each_* macro correctness
    block: Fix memory leak in rw_copy_check_uvector() handling
    bio-integrity: Fix bio_integrity_verify segment start bug
    block: remove unrelated header files and export symbol
    blk-mq: uses page->list incorrectly
    blk-mq: use __smp_call_function_single directly
    btrfs: fix missing increment of bi_remaining
    Revert "block: Warn and free bio if bi_end_io is not set"
    block: Warn and free bio if bi_end_io is not set
    blk-mq: fix initializing request's start time
    block: blk-mq: don't export blk_mq_free_queue()
    block: blk-mq: make blk_sync_queue support mq
    block: blk-mq: support draining mq queue
    dm cache: increment bi_remaining when bi_end_io is restored
    block: fixup for generic bio chaining
    block: Really silence spurious compiler warnings
    block: Silence spurious compiler warnings
    block: Kill bio_pair_split()
    ...

    Linus Torvalds
     

14 Jan, 2014

3 commits

  • This is the raid10 equivalent of

    commit 4f0a5e012cf41321d611e7cad63e1017d143d138
    MD RAID1: Further conditionalize 'fullsync'

    If a device in a newly assembled array is not fully recovered we
    currently do a full resync, but we don't need to.

    Signed-off-by: NeilBrown

    NeilBrown
     
  • commit e875ecea266a543e643b19e44cf472f1412708f9
    md/raid10 record bad blocks as needed during recovery.

    added code to the "cannot recover this block" path to record a bad
    block rather than fail the whole recovery.
    Unfortunately this new case was placed *after* r10bio was freed rather
    than *before*, yet it still uses r10bio.
    This will crash with a null dereference.

    So move the freeing of r10bio down where it is safe.

    Cc: stable@vger.kernel.org (v3.1+)
    Fixes: e875ecea266a543e643b19e44cf472f1412708f9
    Reported-by: Damian Nowak
    URL: https://bugzilla.kernel.org/show_bug.cgi?id=68181
    Signed-off-by: NeilBrown

    NeilBrown
     
  • If we discover a bad block when reading we split the request and
    potentially read some of it from a different device.

    The code path of this has two bugs in RAID10.
    1/ we get a spin_lock with _irq, but unlock without _irq!!
    2/ The calculation of 'sectors_handled' is wrong, as can be clearly
    seen by comparison with raid1.c

    This leads to at least 2 warnings and a probable crash if a RAID10
    ever had known bad blocks.

    Cc: stable@vger.kernel.org (v3.1+)
    Fixes: 856e08e23762dfb92ffc68fd0a8d228f9e152160
    Reported-by: Damian Nowak
    URL: https://bugzilla.kernel.org/show_bug.cgi?id=68181
    Signed-off-by: NeilBrown

    NeilBrown
     

24 Nov, 2013

4 commits

  • The new bio_split() can split arbitrary bios - it's not restricted to
    single page bios, like the old bio_split() (previously renamed to
    bio_pair_split()). It also has different semantics - it doesn't allocate
    a struct bio_pair, leaving it up to the caller to handle completions.

    Then convert the existing bio_pair_split() users to the new bio_split()
    - and also nvme, which was open coding bio splitting.

    (We have to take that BUG_ON() out of bio_integrity_trim() because this
    bio_split() needs to use it, and there's no reason it has to be used on
    bios marked as cloned; BIO_CLONED doesn't seem to have clearly
    documented semantics anyways.)

    Signed-off-by: Kent Overstreet
    Cc: Jens Axboe
    Cc: Martin K. Petersen
    Cc: Matthew Wilcox
    Cc: Keith Busch
    Cc: Vishal Verma
    Cc: Jiri Kosina
    Cc: Neil Brown

    Kent Overstreet
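The new contract can be modelled in a few lines of userspace C: the helper returns the front piece and shrinks the original in place, and no bio_pair is allocated, so completions stay with the caller. All names below are illustrative:

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for a bio: a starting sector and a length. */
struct bio_demo { int sector; int sectors; };

/* Split off the first 'sectors' sectors.  The remainder advances and
 * shrinks in place; the caller owns completion of both pieces. */
struct bio_demo *bio_split_demo(struct bio_demo *bio, int sectors)
{
    struct bio_demo *front = malloc(sizeof(*front));
    if (!front)
        return NULL;
    front->sector  = bio->sector;
    front->sectors = sectors;
    bio->sector  += sectors;
    bio->sectors -= sectors;
    return front;
}
```

Because nothing restricts the split point to a page boundary, this models splitting arbitrary bios, unlike the old single-page bio_split().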
     
  • This is prep work for introducing a more general bio_split().

    Signed-off-by: Kent Overstreet
    Cc: Jens Axboe
    Cc: NeilBrown
    Cc: Alasdair Kergon
    Cc: Lars Ellenberg
    Cc: Peter Osterlund
    Cc: Sage Weil

    Kent Overstreet
     
  • When we start sharing biovecs, keeping bi_vcnt accurate for splits is
    going to be error prone - and unnecessary, if we refactor some code.

    So bio_segments() has to go - but most of the existing users just needed
    to know if the bio had multiple segments, which is easier - add a
    bio_multiple_segments() for them.

    (Two of the current uses of bio_segments() are going to go away in a
    couple patches, but the current implementation of bio_segments() is
    unsafe as soon as we start doing driver conversions for immutable
    biovecs - so implement a dumb version for bisectability, it'll go away
    in a couple patches)

    Signed-off-by: Kent Overstreet
    Cc: Jens Axboe
    Cc: Neil Brown
    Cc: Nagalakshmi Nandigama
    Cc: Sreekanth Reddy
    Cc: "James E.J. Bottomley"

    Kent Overstreet
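A sketch of the interface change with a toy segment array (the kernel's representation is a biovec, not an int array):

```c
#include <assert.h>
#include <stdbool.h>

/* Full count: what bio_segments() used to provide.  Under immutable
 * biovecs, computing this by walking shared vecs becomes unsafe. */
int bio_segments_demo(const int *seg_lens, int n)
{
    int count = 0;
    for (int i = 0; i < n; i++)
        if (seg_lens[i] > 0)
            count++;
    return count;
}

/* Most callers only asked "more than one segment?", which is the
 * easier question bio_multiple_segments() answers. */
bool bio_multiple_segments_demo(const int *seg_lens, int n)
{
    return bio_segments_demo(seg_lens, n) > 1;
}
```

Converting callers to the boolean form removes their dependence on an exact count that will no longer be cheap or safe to compute.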
     
  • Immutable biovecs are going to require an explicit iterator. To
    implement immutable bvecs, a later patch is going to add a bi_bvec_done
    member to this struct; for now, this patch effectively just renames
    things.

    Signed-off-by: Kent Overstreet
    Cc: Jens Axboe
    Cc: Geert Uytterhoeven
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: "Ed L. Cashin"
    Cc: Nick Piggin
    Cc: Lars Ellenberg
    Cc: Jiri Kosina
    Cc: Matthew Wilcox
    Cc: Geoff Levand
    Cc: Yehuda Sadeh
    Cc: Sage Weil
    Cc: Alex Elder
    Cc: ceph-devel@vger.kernel.org
    Cc: Joshua Morris
    Cc: Philip Kelleher
    Cc: Rusty Russell
    Cc: "Michael S. Tsirkin"
    Cc: Konrad Rzeszutek Wilk
    Cc: Jeremy Fitzhardinge
    Cc: Neil Brown
    Cc: Alasdair Kergon
    Cc: Mike Snitzer
    Cc: dm-devel@redhat.com
    Cc: Martin Schwidefsky
    Cc: Heiko Carstens
    Cc: linux390@de.ibm.com
    Cc: Boaz Harrosh
    Cc: Benny Halevy
    Cc: "James E.J. Bottomley"
    Cc: Greg Kroah-Hartman
    Cc: "Nicholas A. Bellinger"
    Cc: Alexander Viro
    Cc: Chris Mason
    Cc: "Theodore Ts'o"
    Cc: Andreas Dilger
    Cc: Jaegeuk Kim
    Cc: Steven Whitehouse
    Cc: Dave Kleikamp
    Cc: Joern Engel
    Cc: Prasad Joshi
    Cc: Trond Myklebust
    Cc: KONISHI Ryusuke
    Cc: Mark Fasheh
    Cc: Joel Becker
    Cc: Ben Myers
    Cc: xfs@oss.sgi.com
    Cc: Steven Rostedt
    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Len Brown
    Cc: Pavel Machek
    Cc: "Rafael J. Wysocki"
    Cc: Herton Ronaldo Krzesinski
    Cc: Ben Hutchings
    Cc: Andrew Morton
    Cc: Guo Chao
    Cc: Tejun Heo
    Cc: Asai Thambi S P
    Cc: Selvan Mani
    Cc: Sam Bradshaw
    Cc: Wei Yongjun
    Cc: "Roger Pau Monné"
    Cc: Jan Beulich
    Cc: Stefano Stabellini
    Cc: Ian Campbell
    Cc: Sebastian Ott
    Cc: Christian Borntraeger
    Cc: Minchan Kim
    Cc: Jiang Liu
    Cc: Nitin Gupta
    Cc: Jerome Marchand
    Cc: Joe Perches
    Cc: Peng Tao
    Cc: Andy Adamson
    Cc: fanchaoting
    Cc: Jie Liu
    Cc: Sunil Mushran
    Cc: "Martin K. Petersen"
    Cc: Namjae Jeon
    Cc: Pankaj Kumar
    Cc: Dan Magenheimer
    Cc: Mel Gorman

    Kent Overstreet
     

21 Nov, 2013

1 commit

  • Pull md update from Neil Brown:
    "Mostly optimisations and obscure bug fixes.
    - raid5 gets less lock contention
    - raid1 gets less contention between normal-io and resync-io during
    resync"

    * tag 'md/3.13' of git://neil.brown.name/md:
    md/raid5: Use conf->device_lock protect changing of multi-thread resources.
    md/raid5: Before freeing old multi-thread worker, it should flush them.
    md/raid5: For stripe with R5_ReadNoMerge, we replace REQ_FLUSH with REQ_NOMERGE.
    UAPI: include in linux/raid/md_p.h
    raid1: Rewrite the implementation of iobarrier.
    raid1: Add some macros to make code clearly.
    raid1: Replace raise_barrier/lower_barrier with freeze_array/unfreeze_array when reconfiguring the array.
    raid1: Add a field array_frozen to indicate whether raid in freeze state.
    md: Convert use of typedef ctl_table to struct ctl_table
    md/raid5: avoid deadlock when raid5 array has unack badblocks during md_stop_writes.
    md: use MD_RECOVERY_INTR instead of kthread_should_stop in resync thread.
    md: fix some places where mddev_lock return value is not checked.
    raid5: Retry R5_ReadNoMerge flag when hit a read error.
    raid5: relieve lock contention in get_active_stripe()
    raid5: relieve lock contention in get_active_stripe()
    wait: add wait_event_cmd()
    md/raid5.c: add proper locking to error path of raid5_start_reshape.
    md: fix calculation of stacking limits on level change.
    raid5: Use slow_path to release stripe when mddev->thread is null

    Linus Torvalds
     

19 Nov, 2013

1 commit

  • We currently use kthread_should_stop() in various places in the
    sync/reshape code to abort early.
    However some places set MD_RECOVERY_INTR but don't immediately call
    md_reap_sync_thread() (and we will shortly get another one).
    When this happens we are relying on md_check_recovery() to reap the
    thread, and that only happens when the thread finishes normally.
    So MD_RECOVERY_INTR must lead to a normal finish without the
    kthread_should_stop() test.

    So replace all relevant tests, and be more careful when the thread is
    interrupted not to acknowledge the latest step in a reshape, as it may
    not be fully committed yet.

    Also add a test on MD_RECOVERY_INTR in the 'is_mddev_idle' loop
    so we don't have to wait for the speed to drop before we can abort.

    Signed-off-by: NeilBrown

    NeilBrown
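The control-flow change can be modelled in userspace: the loop checks an interrupt flag and breaks out through its normal exit path, so cleanup always runs. Flag value and names below are invented:

```c
#include <assert.h>

#define MD_RECOVERY_INTR_DEMO (1u << 0)

struct mddev_demo { unsigned recovery; int cleaned_up; };

/* Instead of bailing out via an external "should stop" signal, the
 * loop tests the INTR flag and falls through to its normal finish,
 * so the usual reaping/cleanup always happens. */
int sync_loop(struct mddev_demo *mddev, int steps)
{
    int done = 0;
    for (int i = 0; i < steps; i++) {
        if (mddev->recovery & MD_RECOVERY_INTR_DEMO)
            break;               /* abort, but reach cleanup below */
        done++;
        if (done == 2)           /* simulate an interruption mid-way */
            mddev->recovery |= MD_RECOVERY_INTR_DEMO;
    }
    mddev->cleaned_up = 1;       /* normal finish path always reached */
    return done;
}
```

An interrupted run stops early yet still executes the same finish code as a completed run, which is what reaping relies on.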
     

24 Oct, 2013

1 commit

  • Since:
    commit 7ceb17e87bde79d285a8b988cfed9eaeebe60b86
    md: Allow devices to be re-added to a read-only array.

    spares are activated on a read-only array. In the case of the raid1 and
    raid10 personalities this causes not-in-sync devices to be marked
    in-sync without checking whether recovery has finished.

    If a read-only array is degraded and one of its devices is not in-sync
    (because the array has been only partially recovered) recovery will be skipped.

    This patch adds a check that recovery has finished before marking a device
    in-sync for the raid1 and raid10 personalities. For the raid5 personality
    such a condition is already present (at raid5.c:6029).

    Bug was introduced in 3.10 and causes data corruption.

    Cc: stable@vger.kernel.org
    Signed-off-by: Pawel Baldysiak
    Signed-off-by: Lukasz Dorau
    Signed-off-by: NeilBrown

    Lukasz Dorau
     

25 Jul, 2013

1 commit

  • We always need to be careful when calling generic_make_request, as it
    can start a chain of events which might free something that we are
    using.

    Here is one place I wasn't careful enough. If the wbio2 is not in
    use, then it might get freed at the first generic_make_request call.
    So perform all necessary tests first.

    This bug was introduced in 3.3-rc3 (24afd80d99) and can cause an
    oops, so the fix is suitable for any -stable since then.

    Cc: stable@vger.kernel.org (3.3+)
    Signed-off-by: NeilBrown

    NeilBrown
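The hazard and the fix ordering can be sketched with a toy submission function that may free its argument immediately. Everything below is invented for illustration; the point is that every test on a request happens before the first hand-off:

```c
#include <assert.h>
#include <stdlib.h>

struct wbio_demo { int in_use; };

int freed;   /* counts completed (freed) requests, for demonstration */

/* Submission may run the whole completion chain synchronously, freeing
 * the request (and possibly related requests) before it returns. */
void submit_demo(struct wbio_demo *w)
{
    free(w);
    freed++;
}

/* Fixed ordering: decide what to do with wbio2 *before* submitting
 * wbio, because submitting wbio can start the chain of events that
 * frees wbio2. */
void submit_pair_fixed(struct wbio_demo *wbio, struct wbio_demo *wbio2)
{
    int wbio2_in_use = wbio2 && wbio2->in_use;   /* all tests first */
    submit_demo(wbio);                           /* may free things  */
    if (wbio2_in_use)
        submit_demo(wbio2);
    else if (wbio2)
        free(wbio2);
}
```

No field of either request is read after its hand-off, which is the discipline the commit message describes.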
     

18 Jul, 2013

1 commit

  • 1/ When a difference between blocks is found, data is copied from
    one bio to the other. However bv_len is used as the length to
    copy and this could be zero. So use r10_bio->sectors to calculate
    length instead.
    Using bv_len was probably always a bit dubious, but the introduction
    of bio_advance made it much more likely to be a problem.

    2/ When preparing some blocks for sync, we don't set BIO_UPTODATE
    except on bios that we schedule for a read. This ensures that
    missing/failed devices don't confuse the loop at the top of
    sync_request write.
    Commit 8be185f2c9d54d6 "raid10: Use bio_reset()"
    removed a loop which set BIO_UPTODATE on all appropriate bios.
    So we need to re-add that flag.

    These bugs were introduced in 3.10, so this patch is suitable for
    3.10-stable, and can remove a potential for data corruption.

    Cc: stable@vger.kernel.org (3.10)
    Reported-by: Brassow Jonathan
    Signed-off-by: NeilBrown

    NeilBrown
     

04 Jul, 2013

1 commit

  • The recent commit:
    commit 7e83ccbecd608b971f340e951c9e84cd0343002f
    md/raid10: Allow skipping recovery when clean arrays are assembled

    causes raid10 to skip a recovery in certain cases where it is safe to
    do so. Unfortunately it also causes a reshape to be skipped, which is
    never safe. The result is that an attempt to reshape a RAID10 will
    appear to complete instantly, but no data will have been moved, so the
    array will now contain garbage.
    (If nothing is written, you can recover by simply performing the
    reverse reshape, which will also complete instantly).

    Bug was introduced in 3.10, so this is suitable for 3.10-stable.

    Cc: stable@vger.kernel.org (3.10)
    Cc: Martin Wilck
    Signed-off-by: NeilBrown

    NeilBrown
     

03 Jul, 2013

1 commit

  • 1/ If a RAID10 is being reshaped to a smaller number of devices
    and is stopped while this is ongoing, then when the array is
    reassembled the 'mirrors' array will be allocated too small.
    This will lead to an access error or memory corruption.

    2/ A sanity test performed when a reshaping RAID10 array is restarted
    is slightly incorrect.

    Due to the first bug, this is suitable for any -stable
    kernel since 3.5 where this code was introduced.

    Cc: stable@vger.kernel.org (v3.5+)
    Signed-off-by: NeilBrown

    NeilBrown
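A sketch of the sizing rule with invented field names: while the reshape is in flight both geometries are live, so allocate for the larger one:

```c
#include <assert.h>

/* Toy stand-in for a raid10 geometry. */
struct geom_demo { int raid_disks; };

/* The buggy assembly path sized the mirrors array from the new
 * geometry alone; with a shrinking reshape that is the smaller count,
 * so accesses through the old geometry ran off the end. */
int mirrors_needed(struct geom_demo prev, struct geom_demo geo)
{
    return prev.raid_disks > geo.raid_disks ? prev.raid_disks
                                            : geo.raid_disks;
}
```

With a 6-to-4 device reshape in progress this yields 6, leaving room for every device the old geometry can still reference.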
     

14 Jun, 2013

4 commits

  • It isn't really enough to check that the rdev is present, we need to
    also be sure that the device is still In_sync.

    Doing this requires using rcu_dereference to access the rdev, and
    holding the rcu_read_lock() to ensure the rdev doesn't disappear while
    we look at it.

    Signed-off-by: NeilBrown

    NeilBrown
     
  • As 'enough' accesses conf->prev and conf->geo, which can change
    spontaneously, it should guard against changes.
    This can be done with device_lock as start_reshape holds device_lock
    while updating 'geo' and end_reshape holds it while updating 'prev'.

    So 'error' needs to hold 'device_lock'.

    On the other hand, raid10_end_read_request knows which of the two it
    really wants to access, and as it is an active request on that one,
    the value cannot change underneath it.

    So change _enough to take flag rather than a pointer, pass the
    appropriate flag from raid10_end_read_request(), and remove the locking.

    All other calls to 'enough' are made with reconfig_mutex held, so
    neither 'prev' nor 'geo' can change.

    Signed-off-by: NeilBrown

    NeilBrown
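The interface change can be sketched as follows; the "enough devices" rule here is a toy, not raid10's real calculation:

```c
#include <assert.h>

struct geo_demo  { int raid_disks; };
struct conf_demo { struct geo_demo prev, geo; };

/* Instead of the caller passing a pointer to conf->prev or conf->geo
 * (which only stays meaningful while a lock is held), the function
 * takes a flag saying which geometry it should consult and reads it
 * itself. */
int enough_demo(struct conf_demo *conf, int previous, int working)
{
    struct geo_demo *g = previous ? &conf->prev : &conf->geo;
    return working >= g->raid_disks - 1;   /* toy redundancy rule */
}
```

A caller like raid10_end_read_request then passes the flag for the geometry its request belongs to, with no locking needed.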
     
  • DM RAID: Add ability to restore transiently failed devices on resume

    This patch adds code to the resume function to check over the devices
    in the RAID array. If any are found to be marked as failed and their
    superblocks can be read, an attempt is made to reintegrate them into
    the array. This allows the user to refresh the array with a simple
    suspend and resume of the array - rather than having to load a
    completely new table, allocate and initialize all the structures and
    throw away the old instantiation.

    Signed-off-by: Jonathan Brassow
    Signed-off-by: NeilBrown

    Jonathan Brassow
     
  • Pull md bugfixes from Neil Brown:
    "A few bugfixes for md

    Some tagged for -stable"

    * tag 'md-3.10-fixes' of git://neil.brown.name/md:
    md/raid1,5,10: Disable WRITE SAME until a recovery strategy is in place
    md/raid1,raid10: use freeze_array in place of raise_barrier in various places.
    md/raid1: consider WRITE as successful only if at least one non-Faulty and non-rebuilding drive completed it.
    md: md_stop_writes() should always freeze recovery.

    Linus Torvalds
     

13 Jun, 2013

3 commits

  • There are cases where the kernel will believe that the WRITE SAME
    command is supported by a block device which does not, in fact,
    support WRITE SAME. This currently happens for SATA drivers behind a
    SAS controller, but there are probably a hundred other ways that can
    happen, including drive firmware bugs.

    After receiving an error for WRITE SAME the block layer will retry the
    request as a plain write of zeroes, but mdraid will consider the
    failure as fatal and consider the drive failed. This has the effect
    that all the mirrors containing a specific set of data are each
    offlined in very rapid succession resulting in data loss.

    However, just bouncing the request back up to the block layer isn't
    ideal either, because the whole initial request-retry sequence should
    be inside the write bitmap fence, which probably means that md needs
    to do its own conversion of WRITE SAME to write zero.

    Until the failure scenario has been sorted out, disable WRITE SAME for
    raid1, raid5, and raid10.

    [neilb: added raid5]

    This patch is appropriate for any -stable since 3.7 when write_same
    support was added.

    Cc: stable@vger.kernel.org
    Signed-off-by: H. Peter Anvin
    Signed-off-by: NeilBrown

    H. Peter Anvin
     
  • Various places in raid1 and raid10 are calling raise_barrier when they
    really should call freeze_array.
    The former is only intended to be called from "make_request".
    The latter has extra checks for 'nr_queued' and makes a call to
    flush_pending_writes(), so it is safe to call it from within the
    management thread.

    Using raise_barrier will sometimes deadlock. Using freeze_array
    should not.

    As 'freeze_array' currently expects one request to be pending (in
    handle_read_error - the only previous caller), we need to pass
    it the number of pending requests (extra) to ignore.

    The deadlock was made particularly noticeable by commits
    050b66152f87c7 (raid10) and 6b740b8d79252f13 (raid1) which
    appeared in 3.4, so the fix is appropriate for any -stable
    kernel since then.

    This patch probably won't apply directly to some early kernels and
    will need to be applied by hand.

    Cc: stable@vger.kernel.org
    Reported-by: Alexander Lyakas
    Signed-off-by: NeilBrown

    NeilBrown
     
  • md/raid1: consider WRITE as successful only if at least one non-Faulty
    and non-rebuilding drive completed it.

    Without that fix, the following scenario could happen:

    - RAID1 with drives A and B; drive B was freshly-added and is rebuilding
    - Drive A fails
    - WRITE request arrives to the array. It is failed by drive A, so
    r1_bio is marked as R1BIO_WriteError, but the rebuilding drive B
    succeeds in writing it, so the same r1_bio is marked as
    R1BIO_Uptodate.
    - r1_bio arrives to handle_write_finished, badblocks are disabled,
    md_error()->error() does nothing because we don't fail the last drive
    of raid1
    - raid_end_bio_io() calls call_bio_endio()
    - As a result, in call_bio_endio():
    if (!test_bit(R1BIO_Uptodate, &r1_bio->state))
    clear_bit(BIO_UPTODATE, &bio->bi_flags);
    this code doesn't clear the BIO_UPTODATE flag, and the whole master
    WRITE succeeds, back to the upper layer.

    So we returned success to the upper layer, even though we had written
    the data onto the rebuilding drive only. But when we want to read the
    data back, we would not read from the rebuilding drive, so this data
    is lost.

    [neilb - applied identical change to raid10 as well]

    This bug can result in lost data, so it is suitable for any
    -stable kernel.

    Cc: stable@vger.kernel.org
    Signed-off-by: Alex Lyakas <alex@zadarastorage.com>
    Signed-off-by: NeilBrown <neilb@suse.de>

    Alex Lyakas
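The rule from this commit can be modelled directly: a write is reported successful only if some drive that is neither Faulty nor rebuilding completed it. The structures below are invented for the example:

```c
#include <assert.h>
#include <stdbool.h>

struct drive_demo { bool faulty, rebuilding, write_ok; };

/* A durable copy exists only on a fully in-sync drive, so a write to a
 * rebuilding drive alone must not count as success. */
bool write_successful(const struct drive_demo *d, int n)
{
    for (int i = 0; i < n; i++)
        if (d[i].write_ok && !d[i].faulty && !d[i].rebuilding)
            return true;
    return false;
}
```

In the scenario from the commit message (drive A failed the write, rebuilding drive B succeeded), this correctly reports the WRITE as failed instead of claiming success for data that only exists on the rebuilding drive.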
     

09 May, 2013

1 commit

  • Pull block core updates from Jens Axboe:

    - Major bit is Kents prep work for immutable bio vecs.

    - Stable candidate fix for a scheduling-while-atomic in the queue
    bypass operation.

    - Fix for the hang on exceeded rq->datalen 32-bit unsigned when merging
    discard bios.

    - Tejuns changes to convert the writeback thread pool to the generic
    workqueue mechanism.

    - Runtime PM framework, SCSI patches exists on top of these in James'
    tree.

    - A few random fixes.

    * 'for-3.10/core' of git://git.kernel.dk/linux-block: (40 commits)
    relay: move remove_buf_file inside relay_close_buf
    partitions/efi.c: replace useless kzalloc's by kmalloc's
    fs/block_dev.c: fix iov_shorten() criteria in blkdev_aio_read()
    block: fix max discard sectors limit
    blkcg: fix "scheduling while atomic" in blk_queue_bypass_start
    Documentation: cfq-iosched: update documentation help for cfq tunables
    writeback: expose the bdi_wq workqueue
    writeback: replace custom worker pool implementation with unbound workqueue
    writeback: remove unused bdi_pending_list
    aoe: Fix unitialized var usage
    bio-integrity: Add explicit field for owner of bip_buf
    block: Add an explicit bio flag for bios that own their bvec
    block: Add bio_alloc_pages()
    block: Convert some code to bio_for_each_segment_all()
    block: Add bio_for_each_segment_all()
    bounce: Refactor __blk_queue_bounce to not use bi_io_vec
    raid1: use bio_copy_data()
    pktcdvd: Use bio_reset() in disabled code to kill bi_idx usage
    pktcdvd: use bio_copy_data()
    block: Add bio_copy_data()
    ...

    Linus Torvalds
     

24 Apr, 2013

1 commit

  • When an array is assembled incrementally with mdadm -I -R
    and the array switches to "active" mode, md starts a recovery.

    If the array was clean, the "fullsync" flag will be 0. Skip
    the full recovery in this case, as RAID1 does (the code was
    actually copied from the sync_request() method of RAID1).

    Signed-off-by: Martin Wilck
    Signed-off-by: NeilBrown

    Martin Wilck
     

24 Mar, 2013

4 commits

  • More prep work for immutable bio vecs, mainly getting rid of references
    to bi_idx.

    bio_reset was being open coded in a few places. The one in sync_request
    was a bit nontrivial to convert, so could use some extra eyeballs.

    Signed-off-by: Kent Overstreet
    CC: Jens Axboe
    CC: NeilBrown
    Acked-by: NeilBrown

    Kent Overstreet
     
  • Random cleanup - this code was duplicated and it's not really specific
    to md.

    Also added the ability to return the actual error code.

    Signed-off-by: Kent Overstreet
    CC: Jens Axboe
    CC: NeilBrown
    Acked-by: Tejun Heo

    Kent Overstreet
     
  • For immutable bvecs, all bi_idx usage needs to be audited - so here
    we're removing all the unnecessary uses.

    Most of these are places where it was being initialized on a bio that
    was just allocated, a few others are conversions to standard macros.

    Signed-off-by: Kent Overstreet
    CC: Jens Axboe

    Kent Overstreet
     
  • In the current code bio_split() won't be seeing partially completed bios
    so this doesn't change any behaviour, but this makes the code a bit
    clearer as to what bio_split() actually requires.

    The immediate purpose of the patch is removing unnecessary bi_idx
    references, but the end goal is to allow partial completed bios to be
    submitted, which along with immutable biovecs enables efficient bio
    splitting.

    Some of the callers were (double) checking that bios could be split, so
    update their checks too.

    Signed-off-by: Kent Overstreet
    CC: Jens Axboe
    CC: Lars Ellenberg
    CC: Neil Brown
    CC: Martin K. Petersen

    Kent Overstreet