08 Mar, 2010

1 commit

  • Constify struct sysfs_ops.

    This is part of the ops structure constification
    effort started by Arjan van de Ven et al.

    Benefits of this constification:

    * prevents modification of data that is shared
    (referenced) by many other structure instances
    at runtime

    * detects/prevents accidental (but not intentional)
    modification attempts on archs that enforce
    read-only kernel data at runtime

    * potentially better optimized code as the compiler
    can assume that the const data cannot be changed

    * the compiler/linker move const data into .rodata
    and therefore exclude them from false sharing
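
    A minimal user-space sketch of the pattern, for illustration only. The
    names demo_ops, demo_object and demo_show are hypothetical and are not
    kernel API; the point is only that a const ops table can be shared by
    many objects through a pointer-to-const.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an ops table such as struct sysfs_ops. */
struct demo_ops {
	size_t (*show)(char *buf, size_t len);
};

static size_t demo_show(char *buf, size_t len)
{
	static const char msg[] = "demo";

	assert(len >= sizeof(msg));
	memcpy(buf, msg, sizeof(msg));
	return sizeof(msg) - 1;
}

/*
 * Declared const: the toolchain may place the table in .rodata, and a
 * statement such as "demo_const_ops.show = NULL;" fails to compile.
 */
static const struct demo_ops demo_const_ops = {
	.show = demo_show,
};

/* Users hold a pointer-to-const, so many objects can share one table. */
struct demo_object {
	const struct demo_ops *ops;
};

static size_t demo_object_show(const struct demo_object *obj,
			       char *buf, size_t len)
{
	return obj->ops->show(buf, len);
}
```

    Two demo_object instances initialised with &demo_const_ops point at the
    same read-only table, which is the sharing the first benefit refers to.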

    Signed-off-by: Emese Revfy
    Acked-by: David Teigland
    Acked-by: Matt Domsch
    Acked-by: Maciej Sosnowski
    Acked-by: Hans J. Koch
    Acked-by: Pekka Enberg
    Acked-by: Jens Axboe
    Acked-by: Stephen Hemminger
    Signed-off-by: Greg Kroah-Hartman

    Emese Revfy
     

06 Mar, 2010

14 commits

  • To prevent deadlock, bios in the hold list should be flushed before
    dm_rh_stop_recovery() is called in mirror_suspend().

    The recovery can't start because there are pending bios and therefore
    dm_rh_stop_recovery deadlocks.

    When there are pending bios in the hold list, the recovery waits for
    the completion of the bios after recovery_count is acquired.
    The recovery_count is released when the recovery finishes. However,
    the bios in the hold list are processed after dm_rh_stop_recovery() in
    mirror_presuspend(). dm_rh_stop_recovery() also acquires recovery_count,
    so a deadlock occurs.

    Signed-off-by: Takahiro Yasui
    Signed-off-by: Alasdair G Kergon
    Reviewed-by: Mikulas Patocka

    Takahiro Yasui
     
  • Eliminate a 4-byte hole in 'struct dm_io_memory' by moving 'offset' above the
    'ptr' to which it applies (size reduced from 24 to 16 bytes). By
    association, one 4-byte hole is eliminated in 'struct dm_io_request' (size
    reduced from 56 to 48 bytes).

    Eliminate all 6 4-byte holes and 1 cache-line in 'struct dm_snapshot' (size
    reduced from 392 to 368 bytes).
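
    The effect of reordering members can be sketched in user space. The two
    structs below are illustrative layouts, not the real 'struct
    dm_io_memory'; the sizes in the comments assume a typical LP64 ABI and
    may differ elsewhere.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative "before" layout: the pointer sits between two 4-byte fields. */
struct io_memory_before {
	int type;         /* 4 bytes, then 4 bytes of padding on LP64 */
	void *ptr;        /* 8-byte aligned on LP64 */
	unsigned offset;  /* 4 bytes, then 4 bytes of tail padding on LP64 */
};

/* Illustrative "after" layout: offset moved above ptr. */
struct io_memory_after {
	int type;         /* 4 bytes */
	unsigned offset;  /* occupies what used to be the hole */
	void *ptr;
};

/* Reordering never makes this pair larger; on LP64 it shrinks 24 -> 16. */
static_assert(sizeof(struct io_memory_after) <= sizeof(struct io_memory_before),
	      "reordered struct must not grow");

static size_t demo_bytes_saved(void)
{
	return sizeof(struct io_memory_before) - sizeof(struct io_memory_after);
}
```

    A tool such as pahole reports the same holes directly from debug info,
    which is usually more convenient than reasoning about layouts by hand.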

    Signed-off-by: Mike Snitzer
    Signed-off-by: Alasdair G Kergon

    Mike Snitzer
     
  • Set a new DM_UEVENT_GENERATED_FLAG when returning from ioctls to
    indicate that a uevent was actually generated. This tells the userspace
    caller that it may need to wait for the event to be processed.

    Signed-off-by: Peter Rajnoha
    Signed-off-by: Alasdair G Kergon

    Peter Rajnoha
     
  • Free the dm_io structure before calling bio_endio() instead of after it,
    to ensure that the io_pool containing it is not referenced after it is
    freed.

    This partially fixes a problem described here
    https://www.redhat.com/archives/dm-devel/2010-February/msg00109.html

    thread 1:
    bio_endio(bio, io_error);
    /* scheduling happens */
    thread 2:
    close the device
    remove the device
    thread 1:
    free_io(md, io);

    Thread 2, when removing the device, sees non-empty md->io_pool (because the
    io hasn't been freed by thread 1 yet) and may crash with BUG in mempool_free.
    Thread 1 may also crash, when freeing into a nonexisting mempool.

    To fix this we must make sure that bio_endio() is the last call and
    the md structure is not accessed afterwards.

    There is another bio_endio in process_barrier, but it is called from the thread
    and the thread is destroyed prior to freeing the mempools, so this call is
    not affected by the bug.

    A similar bug exists with module unloads - the module may be unloaded
    immediately after bio_endio - but that is more difficult to fix.
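
    The ordering problem can be modelled in a few lines of single-threaded
    user-space C. Everything here is a hypothetical stand-in (demo_dev,
    demo_endio, demo_free_io); demo_endio() simply assumes the worst case,
    that the device is closed and removed as soon as completion is reported.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the device that owns the io pool. */
struct demo_dev {
	int pool_in_use;              /* objects still allocated from the pool */
	bool removed;                 /* device has been torn down */
	bool teardown_saw_busy_pool;  /* pool was non-empty at teardown */
};

/* Models completion followed immediately by close + remove. */
static void demo_endio(struct demo_dev *dev)
{
	dev->teardown_saw_busy_pool = dev->pool_in_use != 0;
	dev->removed = true;
}

/* Returns false when the pool is touched after the device is gone. */
static bool demo_free_io(struct demo_dev *dev)
{
	if (dev->removed)
		return false;
	assert(dev->pool_in_use > 0);
	dev->pool_in_use--;
	return true;
}

/* Old order: complete first, free afterwards. */
static bool demo_complete_buggy(struct demo_dev *dev)
{
	demo_endio(dev);
	return demo_free_io(dev);
}

/* New order: free first, make the completion the last call. */
static bool demo_complete_fixed(struct demo_dev *dev)
{
	bool freed = demo_free_io(dev);

	demo_endio(dev);
	return freed && !dev->teardown_saw_busy_pool;
}
```

    In this model the old order trips both failure modes described above
    (busy pool at teardown, free into a removed device); the new order
    trips neither.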

    Signed-off-by: Mikulas Patocka
    Cc: stable@kernel.org
    Signed-off-by: Alasdair G Kergon

    Mikulas Patocka
     
  • Remove unused parameters (start and len) of dm_get_device()
    and fix the callers.

    Signed-off-by: Nikanth Karthikesan
    Signed-off-by: Alasdair G Kergon

    Nikanth Karthikesan
     
  • Only issue a uevent on a resume if the state of the device changed,
    i.e. if it was suspended and/or its table was replaced.

    Signed-off-by: Dave Wysochanski
    Signed-off-by: Mike Snitzer
    Cc: stable@kernel.org
    Signed-off-by: Alasdair G Kergon

    Mike Snitzer
     
  • If all mirror legs fail, always return an error instead of holding the
    bio, even if the handle_errors option was set. At present it is the
    responsibility of the driver underneath us to deal with retries,
    multipath etc.

    The patch adds the bio to the failures list instead of holding it
    directly. do_failures tests first if all legs failed and, if so,
    returns the bio with -EIO. If any leg is still alive and handle_errors
    is set, do_failures calls hold_bio.
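
    The decision described above can be written as a small pure function.
    This is a sketch under stated assumptions: the enum and function names
    are hypothetical, and the case the text does not describe (a leg alive,
    handle_errors unset) is deliberately left as "not covered".

```c
#include <assert.h>
#include <stdbool.h>

enum demo_action {
	DEMO_RETURN_EIO,   /* all legs failed: return the bio with -EIO */
	DEMO_HOLD_BIO,     /* a leg is alive and handle_errors is set */
	DEMO_NOT_COVERED   /* case not described in the text above */
};

static enum demo_action demo_do_failures(bool any_leg_alive, bool handle_errors)
{
	/* The all-legs-failed test comes first, regardless of handle_errors. */
	if (!any_leg_alive)
		return DEMO_RETURN_EIO;
	if (handle_errors)
		return DEMO_HOLD_BIO;
	return DEMO_NOT_COVERED;
}
```

    For example, demo_do_failures(false, true) yields DEMO_RETURN_EIO, which
    is the behaviour change: the bio is no longer held when no leg remains.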

    Reviewed-by: Takahiro Yasui
    Signed-off-by: Mikulas Patocka
    Signed-off-by: Alasdair G Kergon

    Mikulas Patocka
     
  • This patch pulls the pg_init path activation code out of
    process_queued_ios() into a new function.

    No functional change.

    Signed-off-by: Kiyoshi Ueda
    Signed-off-by: Jun'ichi Nomura
    Signed-off-by: Alasdair G Kergon

    Kiyoshi Ueda
     
  • When suspending the device we must wait for all I/O to complete, but
    pg-init may be still in progress even after flushing the workqueue
    for kmpath_handlerd in multipath_postsuspend.

    This patch waits for pg-init completion correctly in
    multipath_postsuspend().

    Signed-off-by: Kiyoshi Ueda
    Signed-off-by: Jun'ichi Nomura
    Signed-off-by: Alasdair G Kergon

    Kiyoshi Ueda
     
  • m->queue_io is set to block processing I/Os, and it needs to be kept
    while pg-init, which issues multiple path activations, is in progress.
    But m->queue_io is cleared when a path activation completes without error
    in pg_init_done(), even while other path activations are in progress.
    That may cause undesired -EIO on paths whose activation has not yet completed.

    This patch fixes that by not clearing m->queue_io until all path
    activations complete.

    (Before the hardware handlers were moved into the SCSI layer, pg_init
    only used one path.)
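
    One way to picture the fix is a counter of outstanding activations, with
    the queueing flag cleared only when the counter reaches zero. The struct
    and function names below are hypothetical and the sketch ignores locking
    and error handling entirely.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, much reduced stand-in for the multipath state. */
struct demo_multipath {
	bool queue_io;                 /* true: hold I/O instead of dispatching */
	unsigned activations_pending;  /* path activations still in flight */
};

/* Start pg-init: block I/O and remember how many activations were issued. */
static void demo_pg_init_start(struct demo_multipath *m, unsigned paths)
{
	m->queue_io = true;
	m->activations_pending = paths;
}

/* Called once per completed activation (success case only). */
static void demo_pg_init_done(struct demo_multipath *m)
{
	assert(m->activations_pending > 0);
	if (--m->activations_pending > 0)
		return;            /* others still running: keep queueing */
	m->queue_io = false;       /* last one finished: release queued I/O */
}
```

    With two activations issued, queue_io stays set after the first
    completion and is cleared only by the second.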

    Signed-off-by: Kiyoshi Ueda
    Signed-off-by: Jun'ichi Nomura
    Signed-off-by: Alasdair G Kergon

    Kiyoshi Ueda
     
  • 'suspended' flag in struct multipath was introduced to check whether
    the multipath target is in suspended state, but the same check is
    done through dm_suspended() now, so remove the flag and related code.

    Signed-off-by: Kiyoshi Ueda
    Signed-off-by: Jun'ichi Nomura
    Cc: Mike Anderson
    Signed-off-by: Alasdair G Kergon

    Kiyoshi Ueda
     
  • Remove the dm_get() in dm_table_get_md() because dm_table_get_md() could
    be called from presuspend/postsuspend, which are called while
    mapped_device is in DMF_FREEING state, where dm_get() is not allowed.

    Justification for that is the lifetime of both objects: In the
    current dm design/implementation, mapped_device is never freed while
    targets are doing something, because dm core waits for targets to become
    quiet in dm_put() using presuspend/postsuspend. So targets should be
    able to touch mapped_device without holding reference count of the
    mapped_device, and we should allow targets to touch mapped_device even
    if it is in DMF_FREEING state.

    Background:
    I'm trying to remove the multipath internal queue, since dm core now has
    a generic queue for request-based dm. In the patch-set, the multipath
    target wants to request dm core to start/stop queue. One of such
    start/stop requests can happen during postsuspend() while the target
    waits for pg-init to complete, because the target stops queue when
    starting pg-init and tries to restart it when completing pg-init. Since
    queue belongs to mapped_device, it involves calling dm_table_get_md()
    and dm_put(). On the other hand, postsuspend() is called in dm_put()
    for mapped_device which is in DMF_FREEING state, and that triggers
    BUG_ON(DMF_FREEING) in the 2nd dm_put().

    I had tried to solve this problem by changing only multipath not to
    touch mapped_device which is in DMF_FREEING state, but I couldn't and I
    came up with a question why we need dm_get() in dm_table_get_md().

    Signed-off-by: Kiyoshi Ueda
    Signed-off-by: Jun'ichi Nomura
    Signed-off-by: Alasdair G Kergon

    Kiyoshi Ueda
     
  • This patch adds two minor fixes while processing device mapper path activation.

    Skip failed paths while calling activate_path. If the path has already failed
    then activate_path will fail for sure, so there is no need to call it. In
    some cases calling it might cause unnecessarily prolonged retries.

    Change the misleading message if the path being activated fails with SCSI_DH_NOSYS.

    Signed-off-by: Babu Moger
    Signed-off-by: Alasdair G Kergon

    Moger, Babu
     
  • This patch removes some unnecessary argument casting. There is no
    functional change with this patch.

    Passes 'struct pgpath' through to pg_init_done() instead of the enclosed
    'struct dm_path'.

    Tested the changes with LSI storage.

    CC: Chandra Seetharaman
    Signed-off-by: Babu Moger
    Acked-by: Kiyoshi Ueda
    Signed-off-by: Alasdair G Kergon

    Moger, Babu
     

03 Mar, 2010

1 commit

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
    percpu: add __percpu sparse annotations to what's left
    percpu: add __percpu sparse annotations to fs
    percpu: add __percpu sparse annotations to core kernel subsystems
    local_t: Remove leftover local.h
    this_cpu: Remove pageset_notifier
    this_cpu: Page allocator conversion
    percpu, x86: Generic inc / dec percpu instructions
    local_t: Move local.h include to ringbuffer.c and ring_buffer_benchmark.c
    module: Use this_cpu_xx to dynamically allocate counters
    local_t: Remove cpu_local_xx macros
    percpu: refactor the code in pcpu_[de]populate_chunk()
    percpu: remove compile warnings caused by __verify_pcpu_ptr()
    percpu: make accessors check for percpu pointer in sparse
    percpu: add __percpu for sparse.
    percpu: make access macros universal
    percpu: remove per_cpu__ prefix.

    Linus Torvalds
     

26 Feb, 2010

2 commits


17 Feb, 2010

8 commits

  • Add __percpu sparse annotations to places which didn't make it in one
    of the previous patches. All conversions are trivial.

    These annotations are to make sparse consider percpu variables to be
    in a different address space and warn if accessed without going
    through percpu accessors. This patch doesn't affect normal builds.

    Signed-off-by: Tejun Heo
    Acked-by: Borislav Petkov
    Cc: Dan Williams
    Cc: Huang Ying
    Cc: Len Brown
    Cc: Neil Brown

    Tejun Heo
     
  • Revert commit d2bb7df8cac647b92f51fb84ae735771e7adbfa7 at Greg's request.

    Author: Milan Broz
    Date: Thu Dec 10 23:51:53 2009 +0000

    dm: sysfs add empty release function to avoid debug warning

    This patch just removes an unnecessary warning:
    kobject: 'dm': does not have a release() function,
    it is broken and must be fixed.

    The kobject is embedded in mapped device struct, so
    code does not need to release memory explicitly here.

    Cc: Greg KH
    Signed-off-by: Alasdair G Kergon

    Alasdair G Kergon
     
  • This patch fixes a problem where the system may stall if the target's
    ->map_rq returns DM_MAPIO_REQUEUE in map_request().
    E.g. a stall happens on a 1-CPU box when a dm-mpath device with
    queue_if_no_path bounces between all-paths-down and paths-up under I/O load.

    When target's ->map_rq returns DM_MAPIO_REQUEUE, map_request() requeues
    the request and returns to dm_request_fn(). Then, dm_request_fn()
    doesn't exit the I/O dispatching loop and continues processing
    the requeued request again.
    This map-and-requeue loop can run with interrupts disabled,
    so a 1-CPU system can stall if this situation happens.

    For example, commands below can stall my 1 CPU box within 1 minute or so:
    # dmsetup table mp
    mp: 0 2097152 multipath 1 queue_if_no_path 0 1 1 service-time 0 1 2 8:144 1 1
    # while true; do dd if=/dev/mapper/mp of=/dev/null bs=1M count=100; done &
    # while true; do
    >   dmsetup message mp 0 "fail_path 8:144"
    >   dmsetup suspend --noflush mp
    >   dmsetup resume mp
    >   dmsetup message mp 0 "reinstate_path 8:144"
    > done

    To fix the problem above, this patch changes dm_request_fn() to exit
    the I/O dispatching loop once a request is requeued in map_request().
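
    The shape of the fix, reduced to a user-space sketch: the dispatch loop
    stops at the first requeue instead of immediately retrying the same
    request. demo_map_rq() and demo_dispatch() are hypothetical names, and a
    negative request id is just a convenient way to say "no path available".

```c
#include <assert.h>
#include <stddef.h>

enum demo_map_result { DEMO_MAPPED, DEMO_REQUEUE };

/* Hypothetical target hook: negative ids cannot be mapped right now. */
static enum demo_map_result demo_map_rq(int request)
{
	return request < 0 ? DEMO_REQUEUE : DEMO_MAPPED;
}

/* Returns how many requests were dispatched before the loop ended. */
static size_t demo_dispatch(const int *queue, size_t n)
{
	size_t dispatched = 0;

	assert(queue != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		/*
		 * On requeue, leave the loop so the caller can return and
		 * the queue is run again later, rather than spinning here.
		 */
		if (demo_map_rq(queue[i]) == DEMO_REQUEUE)
			break;
		dispatched++;
	}
	return dispatched;
}
```

    For the queue {1, 2, -1, 3} the loop dispatches two requests and then
    returns, leaving the requeued request and its successor for a later run.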

    Signed-off-by: Kiyoshi Ueda
    Signed-off-by: Jun'ichi Nomura
    Cc: stable@kernel.org
    Signed-off-by: Alasdair G Kergon

    Kiyoshi Ueda
     
  • When suspending a failed mirror, bios are completed by mirror_end_io() and
    __rh_lookup() in dm_rh_dec() returns NULL where a non-NULL return value is
    required by design. Fix this by not changing the state of the recovery failed
    region from DM_RH_RECOVERING to DM_RH_NOSYNC in dm_rh_recovery_end().

    Issue

    On 2.6.33-rc1 kernel, I hit the bug when I suspended the failed
    mirror by dmsetup command.

    BUG: unable to handle kernel NULL pointer dereference at 00000020
    IP: [] dm_rh_dec+0x35/0xa1 [dm_region_hash]
    ...
    EIP: 0060:[] EFLAGS: 00010046 CPU: 0
    EIP is at dm_rh_dec+0x35/0xa1 [dm_region_hash]
    EAX: 00000286 EBX: 00000000 ECX: 00000286 EDX: 00000000
    ESI: eff79eac EDI: eff79e80 EBP: f6915cd4 ESP: f6915cc4
    DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
    Process dmsetup (pid: 2849, ti=f6914000 task=eff03e80 task.ti=f6914000)
    ...
    Call Trace:
    [] ? mirror_end_io+0x53/0x1b1 [dm_mirror]
    [] ? clone_endio+0x4d/0xa2 [dm_mod]
    [] ? mirror_end_io+0x0/0x1b1 [dm_mirror]
    [] ? clone_endio+0x0/0xa2 [dm_mod]
    [] ? bio_endio+0x28/0x2b
    [] ? hold_bio+0x2d/0x62 [dm_mirror]
    [] ? mirror_presuspend+0xeb/0xf7 [dm_mirror]
    [] ? vmap_page_range+0xb/0xd
    [] ? suspend_targets+0x2d/0x3b [dm_mod]
    [] ? dm_table_presuspend_targets+0xe/0x10 [dm_mod]
    [] ? dm_suspend+0x4d/0x150 [dm_mod]
    [] ? dev_suspend+0x55/0x18a [dm_mod]
    [] ? _copy_from_user+0x42/0x56
    [] ? dm_ctl_ioctl+0x22c/0x281 [dm_mod]
    [] ? dev_suspend+0x0/0x18a [dm_mod]
    [] ? dm_ctl_ioctl+0x0/0x281 [dm_mod]
    [] ? vfs_ioctl+0x22/0x85
    [] ? do_vfs_ioctl+0x4cb/0x516
    [] ? sys_ioctl+0x40/0x5a
    [] ? sysenter_do_call+0x12/0x28

    Analysis

    When the recovery process of a region fails, the dm_rh_recovery_end() function
    changes the state of the region from DM_RH_RECOVERING to DM_RH_NOSYNC.
    When recovery_complete() is executed between dm_rh_update_states() and
    do_writes() in do_mirror(), bios are processed with the region state,
    DM_RH_NOSYNC. However, the region data is freed without checking its
    pending count when dm_rh_update_states() is called next time.

    When bios are finished by mirror_end_io(), __rh_lookup() in dm_rh_dec()
    returns NULL even though a valid return value is expected.

    Solution

    Remove the state change of the recovery failed region from DM_RH_RECOVERING
    to DM_RH_NOSYNC in dm_rh_recovery_end(). We can remove the state change
    because:

    - If the region data has been released by dm_rh_update_states(),
    a new region data is created with the state of DM_RH_NOSYNC, and
    bios are processed according to the DM_RH_NOSYNC state.

    - If the region data has not been released by dm_rh_update_states(),
    a state of the region is DM_RH_RECOVERING and bios are put in the
    delayed_bio list.

    The flag change from DM_RH_RECOVERING to DM_RH_NOSYNC in dm_rh_recovery_end()
    was added in the following commit:
    dm raid1: handle resync failures
    author Jonathan Brassow
    Thu, 12 Jul 2007 16:29:04 +0000 (17:29 +0100)
    http://git.kernel.org/linus/f44db678edcc6f4c2779ac43f63f0b9dfa28b724

    Signed-off-by: Takahiro Yasui
    Reviewed-by: Mikulas Patocka
    Signed-off-by: Alasdair G Kergon

    Takahiro Yasui
     
  • If the mirror log fails when the handle_errors option was not selected
    and there is no remaining valid mirror leg, writes return success even
    though they weren't actually written to any device. This patch
    completes them with EIO instead.

    This code path is taken:
    do_writes:
    bio_list_merge(&ms->failures, &sync);
    do_failures:
    if (!get_valid_mirror(ms)) (false)
    else if (errors_handled(ms)) (false)
    else bio_endio(bio, 0);

    The logic in do_failures is based on presuming that the write was already
    tried: if it succeeded at least on one leg (without handle_errors) it
    is reported as success.

    Reference: https://bugzilla.redhat.com/show_bug.cgi?id=555197

    Signed-off-by: Mikulas Patocka
    Signed-off-by: Alasdair G Kergon

    Mikulas Patocka
     
  • This patch fixes two bugs that revolve around the miscalculation and
    misuse of the variable 'overhead_size'. 'overhead_size' is the size of
    the various header structures used during communication.

    The first bug is the use of 'sizeof' with the pointer of a structure
    instead of the structure itself - resulting in the wrong size being
    computed. This is then used in a check to see if the payload
    (data_size) would be too large for the preallocated structure. Since the
    bug produces a smaller value for the overhead, it was possible for the
    structure to be breached. (Although the current users of the code do
    not currently send enough data to trigger this bug.)

    The second bug is that the 'overhead_size' value is used to compute how
    much of the preallocated space should be cleared before populating it
    with fresh data. This should have simply been 'sizeof(struct cn_msg)'
    not overhead_size. The fact that 'overhead_size' was computed
    incorrectly made this problem "less bad" - leaving only a pointer's
    worth of space at the end uncleared. Thus, this bug was never producing
    a bad result, but still needs to be fixed - especially now that the
    value is computed correctly.
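
    The first bug is a classic C pitfall and is easy to reproduce in
    isolation. The struct and helper names below are hypothetical; only the
    sizeof behaviour itself is standard C.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical 64-byte header, standing in for the real message headers. */
struct demo_hdr {
	unsigned id;
	unsigned seq;
	unsigned len;
	char reserved[52];
};

/* Bug: sizeof applied to the pointer yields the pointer's size. */
static size_t demo_overhead_wrong(const struct demo_hdr *h)
{
	return sizeof(h);
}

/* Fix: sizeof applied to the pointed-to object yields the struct's size. */
static size_t demo_overhead_right(const struct demo_hdr *h)
{
	return sizeof(*h);
}

/* Does overhead + data_size fit in a preallocated buffer of buf_size? */
static int demo_payload_fits(size_t buf_size, size_t overhead, size_t data_size)
{
	return data_size <= buf_size && overhead <= buf_size - data_size;
}
```

    With a 128-byte buffer and a 100-byte payload, the check passes when fed
    the too-small pointer-sized overhead and correctly fails with the real
    one, which is how an undersized overhead lets a payload overrun the
    preallocated structure.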

    Cc: stable@kernel.org
    Signed-off-by: Jonathan Brassow

    Jonathan Brassow
     
  • chunk_io() declares its 'struct mdata_req' on the stack and then
    initializes its 'struct work_struct' member. Annotate the
    initialization of this workqueue with INIT_WORK_ON_STACK to suppress a
    debugobjects warning seen when CONFIG_DEBUG_OBJECTS_WORK is enabled.

    Signed-off-by: Mike Snitzer
    Signed-off-by: Alasdair G Kergon

    Mike Snitzer
     
  • If a table containing zero as stripe count is passed into stripe_ctr
    the code attempts to divide by zero.

    This patch changes DM_TABLE_LOAD to return -EINVAL if the stripe count
    is zero.

    We now get the following error messages:
    device-mapper: table: 253:0: striped: Invalid stripe count
    device-mapper: ioctl: error adding target to table
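
    The guard amounts to validating the divisor before it is used. A minimal
    sketch, assuming a hypothetical helper demo_stripe_width() rather than
    the real stripe_ctr() code:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Compute the per-stripe width of a target of 'len' sectors.
 * Returns 0 on success and -EINVAL for a zero stripe count, so the
 * division below is never reached with a zero divisor.
 */
static int demo_stripe_width(unsigned long long len, unsigned stripes,
			     unsigned long long *width)
{
	assert(width != NULL);
	if (stripes == 0)
		return -EINVAL;
	*width = len / stripes;
	return 0;
}
```

    The caller is expected to turn the -EINVAL into a user-visible message,
    as the table-load messages quoted above do.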

    Signed-off-by: Nikanth Karthikesan
    Cc: stable@kernel.org
    Signed-off-by: Alasdair G Kergon

    Nikanth Karthikesan
     

10 Feb, 2010

1 commit

  • ======
    This fix is related to
    http://bugzilla.kernel.org/show_bug.cgi?id=15142
    but does not address that exact issue.
    ======

    sysfs does not like attributes being removed while they are being accessed
    (i.e. read or written) and waits for the access to complete.

    As accessing some md attributes takes the same lock that is held while
    removing those attributes a deadlock can occur.

    This patch addresses 3 issues in md that could lead to this deadlock.

    Two relate to calling flush_scheduled_work while the lock is held.
    This is probably a bad idea in general and as we use schedule_work to
    delete various sysfs objects it is particularly bad.

    In one case flush_scheduled_work is called from md_alloc (called by
    md_probe) called from do_md_run which holds the lock. This call is
    only present to ensure that ->gendisk is set. However we can be sure
    that gendisk is always set (though possibly we couldn't when that code
    was originally written). This is because do_md_run is called in three
    different contexts:
    1/ from md_ioctl. This requires that md_open has succeeded, and it
    fails if ->gendisk is not set.
    2/ from writing a sysfs attribute. This can only happen if the
    mddev has been registered in sysfs which happens in md_alloc
    after ->gendisk has been set.
    3/ from autorun_array which is only called by autorun_devices, which
    checks for ->gendisk to be set before calling autorun_array.
    So the call to md_probe in do_md_run can be removed, and the check on
    ->gendisk can also go.

    In the other case flush_scheduled_work is being called in do_md_stop,
    purportedly to wait for all md_delayed_delete calls (which delete the
    component rdevs) to complete. However there really isn't any need to
    wait for them - they have already been disconnected in all important
    ways.

    The third issue is that raid5->stop() removes some attribute names
    while the lock is held. There is already some infrastructure in place
    to delay attribute removal until after the lock is released (using
    schedule_work). So extend that infrastructure to remove the
    raid5_attrs_group.

    This does not address all lockdep issues related to the sysfs
    "s_active" lock. The rest can be addressed by splitting that lockdep
    context between symlinks and non-symlinks which hopefully will happen.

    Signed-off-by: NeilBrown

    NeilBrown
     

09 Feb, 2010

1 commit

  • This code was written long ago when it was not possible to
    reshape a degraded array. Now it is, so the current level of
    degradedness needs to be taken into account. Also, newly added
    devices should only reduce degradedness if they are deemed to be
    in-sync.

    In particular, if you convert a RAID5 to a RAID6, and increase the
    number of devices at the same time, then the 5->6 conversion will
    make the array degraded so the current code will produce a wrong
    value for 'degraded' - "-1" to be precise.

    If the reshape runs to completion end_reshape will calculate a correct
    new value for 'degraded', but if a device fails during the reshape an
    incorrect decision might be made based on the incorrect value of
    "degraded".

    This patch is suitable for 2.6.32-stable and if they are still open,
    2.6.31-stable and 2.6.30-stable as well.

    Cc: stable@kernel.org
    Reported-by: Michael Evans
    Signed-off-by: NeilBrown

    NeilBrown
     

11 Jan, 2010

1 commit

  • Make DM use bdev_stack_limits() function so that partition offsets get
    taken into account when calculating alignment. Clarify stacking
    warnings.

    Also remove obsolete clearing of final alignment_offset and misalignment
    flag.

    Signed-off-by: Martin K. Petersen
    Signed-off-by: Mike Snitzer
    Cc: Alasdair G. Kergon
    Signed-off-by: Jens Axboe

    Martin K. Petersen
     

30 Dec, 2009

5 commits

  • If two arrays share a device, then they will not both resync at the
    same time. One will wait for the other to complete.
    While waiting, the MD_RECOVERY_INTR flag is not checked so a device
    failure, which would make the resync pointless, does not cause the
    resync to abort, so the failed device cannot be removed (as it cannot
    be removed while a resync is happening).

    So add a test for MD_RECOVERY_INTR.

    Reported-by: Brett Russ
    Signed-off-by: NeilBrown

    NeilBrown
     
  • Since commit dfc7064500061677720fa26352963c772d3ebe6b,
    ->hot_remove_disks has not removed non-failed devices from
    an array until recovery is no longer possible.
    So the code in do_md_run to get around the fact that
    md_check_recovery (which calls ->hot_remove_disks) would
    remove partially-in-sync devices is no longer needed.

    So remove it.

    Signed-off-by: NeilBrown

    NeilBrown
     
  • By default md_do_sync() will perform recovery if no other actions are
    specified. However, action_show() relies on MD_RECOVERY_RECOVER being
    set, otherwise it returns 'idle'. So set the missing
    MD_RECOVERY_RECOVER flag when starting recovery.

    Signed-off-by: Dan Williams
    Signed-off-by: NeilBrown

    Dan Williams
     
  • The start_ro module parameter can be used to force arrays to be
    started in 'auto-readonly' mode, in which they are read-only until the first
    write. This ensures that no resync/recovery happens until something
    else writes to the device. This is important for resume-from-disk
    off an md array.

    However if an array is started 'readonly' (by writing 'readonly' to
    the 'array_state' sysfs attribute) we want it to be really 'readonly',
    not 'auto-readonly'.

    So strengthen the condition to only set auto-readonly if the
    array is not already read-only.

    Signed-off-by: NeilBrown

    NeilBrown
     
  • evms configures md arrays by:
    open device
    send ioctl
    close device

    for each different ioctl needed.
    Since 2.6.29, the device can disappear after the 'close'
    unless a significant configuration has happened to the device.
    The change made by "SET_ARRAY_INFO" can be too minor to stop the device
    from disappearing, but important enough that losing the change is bad.

    So: make sure SET_ARRAY_INFO sets mddev->ctime, and keep the device
    active as long as ctime is non-zero (it gets zeroed with lots of other
    things when the array is stopped).

    This is suitable for -stable kernels since 2.6.29.

    Signed-off-by: NeilBrown
    Cc: stable@kernel.org

    NeilBrown
     

16 Dec, 2009

3 commits

  • * git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm: (80 commits)
    dm snapshot: use merge origin if snapshot invalid
    dm snapshot: report merge failure in status
    dm snapshot: merge consecutive chunks together
    dm snapshot: trigger exceptions in remaining snapshots during merge
    dm snapshot: delay merging a chunk until writes to it complete
    dm snapshot: queue writes to chunks being merged
    dm snapshot: add merging
    dm snapshot: permit only one merge at once
    dm snapshot: support barriers in snapshot merge target
    dm snapshot: avoid allocating exceptions in merge
    dm snapshot: rework writing to origin
    dm snapshot: add merge target
    dm exception store: add merge specific methods
    dm snapshot: create function for chunk_is_tracked wait
    dm snapshot: make bio optional in __origin_write
    dm mpath: reject messages when device is suspended
    dm: export suspended state to targets
    dm: rename dm_suspended to dm_suspended_md
    dm: swap target postsuspend call and setting suspended flag
    dm crypt: add plain64 iv
    ...

    Linus Torvalds
     
  • Signed-off-by: Joe Perches
    Cc: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joe Perches
     
  • Makes use of skip_spaces() defined in lib/string.c for removing leading
    spaces from strings all over the tree.

    It decreases lib.a code size by 47 bytes and reuses the function tree-wide:
       text    data     bss     dec     hex filename
      64688     584     592   65864   10148 (TOTALS-BEFORE)
      64641     584     592   65817   10119 (TOTALS-AFTER)

    Also, while at it, if we see (*str && isspace(*str)), we can be sure to
    remove the first condition (*str) as the second one (isspace(*str)) also
    evaluates to 0 whenever *str == 0, making it redundant. In other words,
    "a char equals zero is never a space".
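
    Both points can be checked in user space. The helper below is a sketch
    of what a skip_spaces()-style function does; demo_skip_spaces() is a
    hypothetical name and is not the lib/string.c implementation.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Return a pointer to the first non-space character of str.
 * No separate "*str &&" test is needed: isspace('\0') is 0, so the
 * loop stops at the terminating NUL on its own.
 */
static const char *demo_skip_spaces(const char *str)
{
	assert(str != NULL);
	while (isspace((unsigned char)*str))
		++str;
	return str;
}
```

    The cast to unsigned char avoids undefined behaviour when plain char is
    signed and the string contains bytes above 0x7f.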

    Julia Lawall tried the semantic patch (http://coccinelle.lip6.fr) below,
    and found occurrences of this pattern on 3 more files:
    drivers/leds/led-class.c
    drivers/leds/ledtrig-timer.c
    drivers/video/output.c

    @@
    expression str;
    @@

    ( // ignore skip_spaces cases
    while (*str && isspace(*str)) { \(str++;\|++str;\) }
    |
    - *str &&
    isspace(*str)
    )

    Signed-off-by: André Goddard Rosa
    Cc: Julia Lawall
    Cc: Martin Schwidefsky
    Cc: Jeff Dike
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: "H. Peter Anvin"
    Cc: Richard Purdie
    Cc: Neil Brown
    Cc: Kyle McMartin
    Cc: Henrique de Moraes Holschuh
    Cc: David Howells
    Cc:
    Cc: Samuel Ortiz
    Cc: Patrick McHardy
    Cc: Takashi Iwai
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    André Goddard Rosa
     

14 Dec, 2009

3 commits

  • Enable external metadata arrays to manage rebuild checkpointing via a
    md/dev-XXX/recovery_start attribute which reflects rdev->recovery_offset

    Also update resync_start_store to allow 'none' to be written, for
    consistency.

    Signed-off-by: Dan Williams
    Signed-off-by: NeilBrown

    Dan Williams
     
  • Other walks of this list are either under rcu_read_lock() or the list
    mutation lock (mddev_lock()). This protects against the improbable case of a
    disk being removed from the array at the start of md_do_sync().

    Signed-off-by: Dan Williams

    Dan Williams
     
  • As v1.x metadata can record that a member of the array is
    not completely recovered, it makes sense to record that a
    spare has become a regular member of the array at the earliest
    opportunity.
    So remove the tests on "recovery_offset > 0" in super_1_sync
    as they really aren't needed, and schedule a metadata update
    immediately after adding spares to a degraded array.

    This means that if a crash happens immediately after a recovery
    starts, the new device will be included in the array and recovery will
    continue from wherever it was up to. Previously this didn't happen
    unless recovery was at least 1/16 of the way through.

    Signed-off-by: NeilBrown

    NeilBrown