11 May, 2010

17 commits

  • Lai Jiangshan noted that up to 10% of RCU_SOFTIRQ invocations are
    spurious, and tracked this down to the fact that the current
    grace-period machinery will uselessly raise RCU_SOFTIRQ when a given
    CPU needs to go through a quiescent state but has not yet done so.
    In this situation, there might well be nothing useful for RCU_SOFTIRQ
    to do, and the overhead is worth worrying about when the work falls to
    ksoftirqd. This patch therefore avoids raising RCU_SOFTIRQ in this
    situation.

    Changes since v1 (http://lkml.org/lkml/2010/3/30/122 from Lai Jiangshan):

    o Omit the rcu_qs_pending() prechecks, as they aren't that
      much less expensive than the quiescent-state checks.

    o Merge with the set_need_resched() patch that reduces IPIs.

    o Add the new n_rp_report_qs field to the rcu_pending tracing output.

    o Update the tracing documentation accordingly.

    Signed-off-by: Lai Jiangshan
    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
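
    A minimal illustrative sketch of the decision described in the entry
    above; the structure and field names here are hypothetical stand-ins,
    not the actual rcu_data layout:

        struct cpu_rcu_state {            /* illustration only */
                int qs_pending;           /* grace period waits on this CPU */
                int passed_quiesc;        /* CPU has passed a quiescent state */
                int cbs_ready;            /* callbacks are ready to invoke */
        };

        static int should_raise_rcu_softirq(const struct cpu_rcu_state *s)
        {
                /*
                 * A quiescent state is owed but not yet passed: RCU_SOFTIRQ
                 * would have nothing useful to do, so don't wake ksoftirqd
                 * for it.
                 */
                if (s->qs_pending && !s->passed_quiesc)
                        return 0;
                /* Otherwise raise the softirq only if there is real work. */
                return s->cbs_ready;
        }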
     
  • TREE_RCU assumes that CPU numbering is contiguous, but some users need
    large holes in the numbering to better map to hardware layout. This patch
    makes TREE_RCU (and TREE_PREEMPT_RCU) tolerate large holes in the CPU
    numbering. However, NR_CPUS must still be greater than the largest
    CPU number.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • The existing RCU CPU stall-warning messages can be confusing, especially
    in the case where one CPU detects a single other stalled CPU. In addition,
    the console messages did not say which flavor of RCU detected the stall,
    which can make it difficult to work out exactly what is causing the stall.
    This commit improves these messages.

    Requested-by: Dhaval Giani
    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • Print boot-time messages if tracing is enabled, if fanout is set
    to non-default values, if exact fanout is specified, if accelerated
    dyntick-idle grace periods have been enabled, if RCU-lockdep is enabled,
    if rcutorture has been boot-time enabled, if the CPU stall detector has
    been disabled, or if four-level hierarchy has been enabled.

    This is all for TREE_RCU and TREE_PREEMPT_RCU. TINY_RCU will be handled
    separately, if at all.

    Suggested-by: Josh Triplett
    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • The current RCU CPU stall warnings remain enabled even after a panic
    occurs, which some people have found to be a bit counterproductive.
    This patch therefore uses a notifier to disable stall warnings once a
    panic occurs.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
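
    The mechanism can be sketched as follows, assuming a flag (the
    hypothetical rcu_stall_suppressed below) that the stall-warning code
    checks before printing; the notifier calls themselves are the stock
    kernel panic-notifier interface:

        #include <linux/kernel.h>
        #include <linux/notifier.h>
        #include <linux/init.h>

        /* Hypothetical flag consulted before printing a stall warning. */
        static int rcu_stall_suppressed;

        static int rcu_panic_notify(struct notifier_block *nb,
                                    unsigned long event, void *unused)
        {
                rcu_stall_suppressed = 1;  /* no more warnings after panic */
                return NOTIFY_DONE;
        }

        static struct notifier_block rcu_panic_nb = {
                .notifier_call = rcu_panic_notify,
        };

        static int __init rcu_register_panic_notifier(void)
        {
                atomic_notifier_chain_register(&panic_notifier_list,
                                               &rcu_panic_nb);
                return 0;
        }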
     
  • The CPU_STALL_VERBOSE kernel configuration parameter was added to
    2.6.34 to identify any preempted/blocked tasks that were preventing
    the current grace period from completing when running preemptible
    RCU. As is conventional for new configuration parameters, it defaulted
    to disabled. It is now time to enable it by default.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • TINY_RCU does not need rcu_scheduler_active unless CONFIG_DEBUG_LOCK_ALLOC
    is set, so conditionally compile rcu_scheduler_active in order to slim
    down rcutiny a bit more. This also gets rid of an EXPORT_SYMBOL_GPL(),
    which is responsible for most of the slimming.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
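
    A sketch of what the conditional compilation amounts to (not the exact
    diff; shown as it would appear in the rcutiny source):

        #ifdef CONFIG_DEBUG_LOCK_ALLOC
        int rcu_scheduler_active __read_mostly;
        EXPORT_SYMBOL_GPL(rcu_scheduler_active);
        #endif /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */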
     
  • The addition of preemptible RCU to treercu resulted in a bit of
    confusion and inefficiency surrounding the handling of context switches
    for RCU-sched and for RCU-preempt. For RCU-sched, a context switch
    is a quiescent state, pure and simple, just like it always has been.
    For RCU-preempt, a context switch is in no way a quiescent state, but
    special handling is required when a task blocks in an RCU read-side
    critical section.

    However, the callout from the scheduler and the outer loop in ksoftirqd
    still call something named rcu_sched_qs(), whose name is no longer
    accurate. Furthermore, when rcu_check_callbacks() notes an RCU-sched
    quiescent state, it ends up unnecessarily (though harmlessly, aside
    from the performance hit) enqueuing the current task if it happens to
    be running in an RCU-preempt read-side critical section. This not only
    increases the maximum latency of scheduler_tick(), it also needlessly
    increases the overhead of the next outermost rcu_read_unlock() invocation.

    This patch addresses this situation by separating the notion of RCU's
    context-switch handling from that of RCU-sched's quiescent states.
    The context-switch handling is covered by rcu_note_context_switch() in
    general and by rcu_preempt_note_context_switch() for preemptible RCU.
    This permits rcu_sched_qs() to handle quiescent states and only quiescent
    states. It also reduces the maximum latency of scheduler_tick(), though
    probably by much less than a microsecond. Finally, it means that tasks
    within preemptible-RCU read-side critical sections avoid incurring the
    overhead of queuing unless there really is a context switch.

    Suggested-by: Lai Jiangshan
    Acked-by: Lai Jiangshan
    Signed-off-by: Paul E. McKenney
    Cc: Ingo Molnar
    Cc: Peter Zijlstra

    Paul E. McKenney
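
    A rough sketch of the resulting split, using the function names from
    the entry above (the bodies are illustrative only):

        void rcu_sched_qs(int cpu)
        {
                /* Record an RCU-sched quiescent state for this CPU,
                 * and nothing more. */
        }

        void rcu_note_context_switch(int cpu)
        {
                /* A context switch is always an RCU-sched quiescent
                 * state... */
                rcu_sched_qs(cpu);
                /* ...while RCU-preempt needs special handling only if
                 * the outgoing task is in an RCU read-side critical
                 * section. */
                rcu_preempt_note_context_switch(cpu);
        }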
     
  • Make naming line up in preparation for CONFIG_TINY_PREEMPT_RCU.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • Because synchronize_rcu_bh() is identical to synchronize_sched(),
    make the former a static inline invoking the latter, saving the
    overhead of an EXPORT_SYMBOL_GPL() and the duplicate code.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
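
    In TINY_RCU this amounts to something like the following header-level
    sketch:

        static inline void synchronize_rcu_bh(void)
        {
                synchronize_sched();  /* identical semantics in TINY_RCU */
        }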
     
  • The rcu_scheduler_active check has been wrapped into the new
    debug_lockdep_rcu_enabled() function, so update the comments to
    reflect this new reality.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • It is CONFIG_DEBUG_LOCK_ALLOC rather than CONFIG_PROVE_LOCKING, so fix it.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • Offline CPUs are not in nohz_cpu_mask, but can be ignored when checking
    for the last non-dyntick-idle CPU. This patch therefore only checks
    online CPUs for not being dyntick idle, allowing fast entry into
    full-system dyntick-idle state even when there are some offline CPUs.

    Signed-off-by: Lai Jiangshan
    Signed-off-by: Paul E. McKenney

    Lai Jiangshan
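
    A simplified sketch of the check; the helper name is hypothetical,
    while cpu_online_mask and nohz_cpu_mask are the real masks (the
    on-stack cpumask is for brevity only):

        #include <linux/cpumask.h>
        #include <linux/sched.h>

        /* Is "cpu" the only online CPU not yet in dyntick-idle mode?
         * Offline CPUs no longer spoil the answer. */
        static bool last_non_dyntick_idle_cpu(int cpu)
        {
                struct cpumask non_idle;

                cpumask_andnot(&non_idle, cpu_online_mask, nohz_cpu_mask);
                return cpumask_weight(&non_idle) == 1 &&
                       cpumask_test_cpu(cpu, &non_idle);
        }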
     
  • Shrink the RCU_INIT_FLAVOR() macro by moving all but the initialization
    of the ->rda[] array to rcu_init_one(). The call to rcu_init_one()
    can then be moved to the end of the RCU_INIT_FLAVOR() macro, which is
    required because rcu_boot_init_percpu_data(), which is now called from
    rcu_init_one(), depends on the initialization of the ->rda[] array.

    Signed-off-by: Lai Jiangshan
    Signed-off-by: Paul E. McKenney

    Lai Jiangshan
     
  • cleanup: make dead code really dead

    Signed-off-by: Lai Jiangshan
    Signed-off-by: Paul E. McKenney

    Lai Jiangshan
     
  • This patch adds a check to __rcu_pending() that does a local
    set_need_resched() if the current CPU is holding up the current grace
    period and if force_quiescent_state() will be called soon. The goal is
    to reduce the probability that force_quiescent_state() will need to do
    smp_send_reschedule(), which sends an IPI and is therefore more expensive
    on most architectures.

    Signed-off-by: "Paul E. McKenney"

    Paul E. McKenney
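
    Schematically, the added check looks something like this; the two
    predicates are hypothetical names for "this CPU is holding up the
    grace period" and "force_quiescent_state() is due soon":

        static void nudge_self_if_blocking_gp(struct rcu_data *rdp)
        {
                /* set_need_resched() is far cheaper than the
                 * smp_send_reschedule() IPI that force_quiescent_state()
                 * might otherwise have to send to this CPU. */
                if (cpu_blocking_current_gp(rdp) && fqs_due_soon(rdp))
                        set_need_resched();
        }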
     
  • There is no need to disable lockdep after an RCU lockdep splat,
    so remove the debug_lockdeps_off() from lockdep_rcu_dereference().
    To avoid repeated lockdep splats, use a static variable in the inlined
    rcu_dereference_check() and rcu_dereference_protected() macros so that
    a given instance splats only once, but so that multiple instances can
    be detected per boot.

    This is controlled by a new config variable CONFIG_PROVE_RCU_REPEATEDLY,
    which is disabled by default. This provides the normal lockdep behavior
    by default, but permits people who want to find multiple RCU-lockdep
    splats per boot to easily do so.

    Requested-by: Eric Paris
    Signed-off-by: Lai Jiangshan
    Tested-by: Eric Paris
    Signed-off-by: Paul E. McKenney

    Lai Jiangshan
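
    The "splat only once per instance" behaviour can be sketched as a
    macro with a per-expansion static flag (simplified; report_rcu_splat()
    is a hypothetical stand-in for the lockdep report):

        #define RCU_SPLAT_ONCE(cond, msg)                               \
                do {                                                    \
                        static int __warned;   /* one flag per site */  \
                        if ((cond) && !__warned) {                      \
                                __warned = 1;                           \
                                report_rcu_splat(msg);                  \
                        }                                               \
                } while (0)

    Presumably CONFIG_PROVE_RCU_REPEATEDLY simply bypasses the flag, so
    every occurrence is reported.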
     

10 May, 2010

3 commits

  • Linus Torvalds
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6:
    [SCSI] Retry commands with UNIT_ATTENTION sense codes to fix ext3/ext4 I/O error
    [SCSI] Enable retries for SYNCRONIZE_CACHE commands to fix I/O error
    [SCSI] scsi_debug: virtual_gb ignores sector_size
    [SCSI] libiscsi: regression: fix header digest errors
    [SCSI] fix locking around blk_abort_request()
    [SCSI] advansys: fix narrow board error path

    Linus Torvalds
     
  • commit 672917dcc78 ("cpuidle: menu governor: reduce latency on exit")
    added an optimization that moved the analysis of the past idle period
    from the end of that idle period to the beginning of the next one.

    Unfortunately, this optimization had a bug: it zeroed, for reuse, one
    key variable that the analysis still needs. The fix is simple: zero
    the variable only after the work on the previous idle period is done.

    During the audit of the code that found this issue, another issue was
    also found: the ->measured_us data structure member is never set; a
    local variable is always used instead.

    Signed-off-by: Arjan van de Ven
    Cc: Corrado Zoccolo
    Cc: stable@kernel.org
    Signed-off-by: Linus Torvalds

    Arjan van de Ven
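
    The ordering problem and its fix can be sketched like this (the names
    are hypothetical, not the actual menu-governor fields):

        static void begin_new_idle(struct menu_sketch *s)
        {
                if (s->needs_update) {
                        analyse_previous_idle(s);  /* still reads the old
                                                      bookkeeping        */
                        s->needs_update = 0;       /* zero it only after
                                                      that work is done  */
                }
                /* ...then go on to predict the new idle period... */
        }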
     

08 May, 2010

9 commits


07 May, 2010

11 commits

  • Some time ago we stopped the clean/active metadata updates
    from being written to a 'spare' device in most cases, so that
    it could spin down and stay spun down. Device failure/removal
    etc. are still recorded on spares.

    However, commit 51d5668cb2e3fd1827a55 broke this 50% of the time,
    depending on whether the event count is even or odd.
    The change log entry said:

        This means that the alignment between 'odd/even' and
        'clean/dirty' might take a little longer to attain,

    however the code makes no attempt to create that alignment, so it
    could take arbitrarily long.

    So when we find that clean/dirty is not aligned with odd/even,
    force a second metadata-update immediately. There are already cases
    where a second metadata-update is needed immediately (e.g. when a
    device fails during the metadata update). We just piggy-back on that.

    Reported-by: Joe Bryant
    Signed-off-by: NeilBrown
    Cc: stable@kernel.org

    NeilBrown
     
  • Fix: RAID-6 was not trying to correct a read error when in the
    singly-degraded state; instead it was dropping one more device and going
    to the doubly-degraded state. This patch fixes that behaviour.

    Tested-by: Janos Haar
    Signed-off-by: Gabriele A. Trombetti
    Reported-by: Janos Haar
    Signed-off-by: NeilBrown
    Cc: stable@kernel.org

    Gabriele A. Trombetti
     
  • With CONFIG_PROVE_RCU=y, a warning can be triggered:

    # mount -t cgroup -o blkio xxx /mnt
    # mkdir /mnt/subgroup

    ...
    kernel/cgroup.c:4442 invoked rcu_dereference_check() without protection!
    ...

    To fix this, we avoid calling css_depth() here, which is a bit simpler
    than the original code.

    Signed-off-by: Li Zefan
    Acked-by: Vivek Goyal
    Signed-off-by: Jens Axboe

    Li Zefan
     
  • …5903' and 'misc-2.6.34' into release

    Len Brown
     
  • It's unused and buggy in its current form, since it can place a bo
    in the reserved state without removing it from lru lists.

    Signed-off-by: Thomas Hellstrom
    Signed-off-by: Dave Airlie

    Thomas Hellstrom
     
  • Signed-off-by: Thomas Hellstrom
    Signed-off-by: Dave Airlie

    Thomas Hellstrom
     
  • Bring radeon up to speed with the async event synchronization for
    drmWaitVblank. See c9a9c5e02aedc1a2815877b0268f886d2640b771 for
    more information. Without this patch, events never get delivered to
    the userspace client.

    Signed-off-by: Jerome Glisse
    Signed-off-by: Dave Airlie

    Jerome Glisse
     
  • Move the FIFO reset from pxa_camera_start_capture() to pxa_camera_irq(),
    directly before the DMA start that follows an end-of-frame interrupt, to
    prevent images from shifting because of old data at the beginning of the
    frame.

    Signed-off-by: Stefan Herbrechtsmeier
    Acked-by: Robert Jarzmik
    Tested-by: Antonio Ospite
    Signed-off-by: Guennadi Liakhovetski
    Signed-off-by: Mauro Carvalho Chehab

    Stefan Herbrechtsmeier
     
  • soc_mbus_bytes_per_line() returns -EINVAL on error but we store it in an
    unsigned int so the test for less than zero doesn't work. I think it
    always returns "small" positive values so we can just cast it to int
    here.

    Signed-off-by: Dan Carpenter
    Signed-off-by: Guennadi Liakhovetski
    Signed-off-by: Mauro Carvalho Chehab

    Dan Carpenter
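
    The underlying issue is the usual signed/unsigned mix-up; a sketch of
    the pattern being fixed (the call is real, the surrounding variables
    are illustrative):

        unsigned int bytes_per_line = soc_mbus_bytes_per_line(width, fmt);

        if ((int)bytes_per_line < 0)     /* always false without the cast,
                                            since unsigned is never < 0  */
                return (int)bytes_per_line;   /* propagate -EINVAL */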
     
  • This fixes a regression of

    7d58289 (mx1: prefix SOC specific defines with MX1_ and deprecate old names)

    Signed-off-by: Uwe Kleine-König
    Acked-by: Sascha Hauer
    Signed-off-by: Guennadi Liakhovetski
    Signed-off-by: Mauro Carvalho Chehab

    Uwe Kleine-König
     
  • Never call dvb_frontend_detach if we failed to attach a frontend. This fixes
    the following oops, which will be triggered by a missing stv090x module:

    [ 8.172997] DVB: registering new adapter (TT-Budget S2-1600 PCI)
    [ 8.209018] adapter has MAC addr = 00:d0:5c:cc:a7:29
    [ 8.328665] Intel ICH 0000:00:1f.5: PCI INT B -> GSI 17 (level, low) -> IRQ 17
    [ 8.328753] Intel ICH 0000:00:1f.5: setting latency timer to 64
    [ 8.562047] DVB: Unable to find symbol stv090x_attach()
    [ 8.562117] BUG: unable to handle kernel NULL pointer dereference at 000000ac
    [ 8.562239] IP: [] dvb_frontend_detach+0x4/0x67 [dvb_core]

    Ref http://bugs.debian.org/575207

    Signed-off-by: Bjørn Mork
    Cc: stable@kernel.org
    Signed-off-by: Mauro Carvalho Chehab

    Bjørn Mork
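
    The rule boils down to the following shape; my_card and the attach
    helper are hypothetical, while dvb_frontend_detach() is the real call:

        static int frontend_init(struct my_card *card)
        {
                card->fe = attach_frontend(card);  /* NULL on failure,
                                                      e.g. missing module */
                if (!card->fe)
                        return -ENODEV;  /* nothing attached: never detach */
                return 0;
        }

        static void frontend_exit(struct my_card *card)
        {
                if (card->fe)            /* detach only what was attached */
                        dvb_frontend_detach(card->fe);
        }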