23 May, 2011

3 commits


21 May, 2011

17 commits

  • Signed-off-by: Gustavo F. Padovan
    Signed-off-by: Jens Axboe

    Gustavo F. Padovan
     
  • We don't need them anymore, so kill:

    - REQ_ON_PLUG checks in various places
    - !rq_mergeable() check in plug merging

    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • This patch merges in a fix that missed 2.6.39 final.

    Conflicts:
    block/blk.h

    Jens Axboe
     
  • Currently we take a queue lock on each bio to check if there are any
    throttling rules associated with the group and also update the stats.
    Now access the group under rcu and update the stats without taking
    the queue lock. Queue lock is taken only if there are throttling rules
    associated with the group.

    So in the common case of the root group with no rules, we avoid
    unnecessary pounding of the request queue lock.

    Signed-off-by: Vivek Goyal
    Signed-off-by: Jens Axboe

    Vivek Goyal
     
  • Now dispatch stats update is lock free. But reset of these stats still
    takes blkg->stats_lock and is dependent on that. As stats are per cpu,
    we should be able to just reset the stats on each cpu without any locks.
    (At least for 64bit arch).

    On 32bit arch there is a small race where 64bit updates are not atomic.
    The result of this race can be that in the presence of other writers,
    one might not get 0 value after reset of a stat and might see something
    intermediate.

    One can write more complicated code to cover this race like sending IPI
    to other cpus to reset stats and for offline cpus, reset these directly.

    Right now I am not taking that path because reset_update is more of a
    debug feature and it can happen only on 32bit arch and possibility of
    it happening is small. Will fix it if it becomes a real problem. For
    the time being going for code simplicity.

    Signed-off-by: Vivek Goyal
    Signed-off-by: Jens Axboe

    Vivek Goyal
     
    Some of the stats are 64bit and updates will be non-atomic on 32bit
    architectures. Use sequence counters on 32bit arch to make reading
    of stats safe.

    Signed-off-by: Vivek Goyal
    Signed-off-by: Jens Axboe

    Vivek Goyal
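
The sequence-counter idea in the commit above can be sketched in userspace C. This is only an illustration under stated assumptions, not the kernel code: `struct stat64`, `stat64_add()` and `stat64_read()` are hypothetical names, C11 atomics stand in for the kernel primitives, and a single writer per counter is assumed. The 64-bit value is stored as two 32-bit halves to mimic a 32-bit architecture.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical 64-bit stat guarded by a sequence counter. */
struct stat64 {
    atomic_uint seq;        /* odd while an update is in progress */
    _Atomic uint32_t lo;    /* low 32 bits of the value */
    _Atomic uint32_t hi;    /* high 32 bits of the value */
};

/* Writer side: assumes exactly one writer per counter. */
static void stat64_add(struct stat64 *s, uint64_t delta)
{
    uint64_t v = ((uint64_t)atomic_load(&s->hi) << 32) | atomic_load(&s->lo);

    assert((atomic_load(&s->seq) & 1u) == 0);   /* no update in flight */
    v += delta;
    atomic_fetch_add(&s->seq, 1);               /* sequence becomes odd */
    atomic_store(&s->lo, (uint32_t)v);
    atomic_store(&s->hi, (uint32_t)(v >> 32));
    atomic_fetch_add(&s->seq, 1);               /* sequence becomes even */
}

/* Reader side: retry until both halves come from the same update. */
static uint64_t stat64_read(struct stat64 *s)
{
    unsigned int start, end;
    uint32_t lo, hi;

    do {
        start = atomic_load(&s->seq);
        lo = atomic_load(&s->lo);
        hi = atomic_load(&s->hi);
        end = atomic_load(&s->seq);
    } while ((start & 1u) || start != end);

    return ((uint64_t)hi << 32) | lo;
}
```

A reader never takes a lock. It simply retries if the sequence was odd or changed while it was reading, so it cannot return a value whose two halves come from different updates.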
     
    Currently we take blkg_stat lock even for updating the stats. So even if
    a group has no throttling rules (common case for root group), we end
    up taking blkg_lock for updating the stats.

    Make dispatch stats per cpu so that these can be updated without taking
    blkg lock.

    If cpu goes offline, these stats simply disappear. No protection has
    been provided for that yet. Do we really need anything for that?

    Signed-off-by: Vivek Goyal
    Signed-off-by: Jens Axboe

    Vivek Goyal
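
A rough userspace sketch of the per-cpu stats layout described above follows. The names (`NR_CPUS_DEMO`, `struct group_stats`, `stats_add()`, `stats_sum()`, `stats_reset()`) are hypothetical, and a fixed array stands in for real per-cpu allocation. The point is only that each CPU writes its own slot, so updates need no shared lock, while readers sum over all slots.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS_DEMO 4   /* hypothetical fixed CPU count for the sketch */

/* One counter slot per CPU; each slot is written only by its own CPU. */
struct group_stats {
    uint64_t dispatched[NR_CPUS_DEMO];
};

/* Update path: touches only the caller's slot, so no lock is taken. */
static void stats_add(struct group_stats *gs, int cpu, uint64_t n)
{
    assert(cpu >= 0 && cpu < NR_CPUS_DEMO);
    gs->dispatched[cpu] += n;
}

/* Read path: the reported value is the sum over all CPU slots. */
static uint64_t stats_sum(const struct group_stats *gs)
{
    uint64_t total = 0;

    for (int cpu = 0; cpu < NR_CPUS_DEMO; cpu++)
        total += gs->dispatched[cpu];
    return total;
}

/* Reset path: clear every slot in turn, again without a lock. */
static void stats_reset(struct group_stats *gs)
{
    for (int cpu = 0; cpu < NR_CPUS_DEMO; cpu++)
        gs->dispatched[cpu] = 0;
}
```

In this simplified form a slot that goes away takes its count with it, which mirrors the open question in the commit message about CPUs going offline.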
     
  • Soon we will allow accessing a throtl_grp under rcu_read_lock(). Hence
    start freeing up throtl_grp after one rcu grace period.

    Signed-off-by: Vivek Goyal
    Signed-off-by: Jens Axboe

    Vivek Goyal
     
  • Use same helper function for root group as we use with dynamically
    allocated groups to add it to various lists.

    Signed-off-by: Vivek Goyal
    Signed-off-by: Jens Axboe

    Vivek Goyal
     
    A helper function for code which is used in 2-3 places. Makes reading
    the code a little easier.

    Signed-off-by: Vivek Goyal
    Signed-off-by: Jens Axboe

    Vivek Goyal
     
  • Currently, we allocate root throtl_grp statically. But as we will be
    introducing per cpu stat pointers and that will be allocated
    dynamically even for root group, we might as well make whole root
    throtl_grp allocation dynamic and treat it in same manner as other
    groups.

    Signed-off-by: Vivek Goyal
    Signed-off-by: Jens Axboe

    Vivek Goyal
     
  • Currently, all the cfq_group or throtl_group allocations happen while
    we are holding ->queue_lock and sleeping is not allowed.

    Soon, we will move to per cpu stats and also need to allocate the
    per group stats. As one can not call alloc_percpu() from atomic
    context as it can sleep, we need to drop ->queue_lock, allocate the
    group, retake the lock and continue processing.

    In the throttling code, I check the queue DEAD flag again to make sure
    that the driver did not call blk_cleanup_queue() in the meantime.

    Signed-off-by: Vivek Goyal
    Signed-off-by: Jens Axboe

    Vivek Goyal
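
The drop-lock, allocate, retake-lock sequence from the commit above can be sketched as follows. This is a single-threaded illustration, not the kernel implementation: `struct queue`, `struct group`, `queue_lock()`, `queue_unlock()` and `group_alloc_locked()` are hypothetical, a boolean stands in for `->queue_lock`, and `calloc()` stands in for an allocation that may sleep.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct queue {
    bool locked;   /* stand-in for ->queue_lock */
    bool dead;     /* stand-in for the queue DEAD flag */
};

struct group {
    int weight;
};

static void queue_lock(struct queue *q)
{
    assert(!q->locked);
    q->locked = true;
}

static void queue_unlock(struct queue *q)
{
    assert(q->locked);
    q->locked = false;
}

/* Called with the lock held; always returns with the lock held. */
static struct group *group_alloc_locked(struct queue *q)
{
    struct group *g;

    assert(q->locked);
    queue_unlock(q);              /* the allocation may sleep: drop the lock */
    g = calloc(1, sizeof(*g));
    queue_lock(q);                /* retake the lock before continuing */

    if (g && q->dead) {           /* queue was torn down while unlocked */
        free(g);
        return NULL;
    }
    return g;
}
```

The re-check after retaking the lock is the important part: anything observed before the lock was dropped may no longer be true.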
     
  • blkg->key = cfqd is an rcu protected pointer and hence we used to do
    call_rcu(cfqd->rcu_head) to free up cfqd after one rcu grace period.

    The problem here is that even though cfqd is around, there are no
    guarantees that the associated request queue (td->queue) or q->queue_lock
    is still around. A driver might have called blk_cleanup_queue() and
    released the lock.

    It might happen that after the lock has been freed we access
    blkg->key->queue->queue_lock and crash. This is possible in the following
    path.

    blkiocg_destroy()
    blkio_unlink_group_fn()
    cfq_unlink_blkio_group()

    Hence, wait for an rcu period if there are groups which have not
    been unlinked from blkcg->blkg_list. That way, any groups which are
    taking the cfq_unlink_blkio_group() path can safely take the queue
    lock.

    This is how we have taken care of race in throttling logic also.

    Signed-off-by: Vivek Goyal
    Signed-off-by: Jens Axboe

    Vivek Goyal
     
  • Nobody seems to be using cfq_find_alloc_cfqg() function parameter "create".
    Get rid of that.

    Signed-off-by: Vivek Goyal
    Signed-off-by: Jens Axboe

    Vivek Goyal
     
    cgroup unaccounted_time file is created only if CONFIG_DEBUG_BLK_CGROUP=y.
    There are some fields which are outside this config option. Fix that.

    Signed-off-by: Vivek Goyal
    Signed-off-by: Jens Axboe

    Vivek Goyal
     
  • Group initialization code seems to be at two places. root group
    initialization in blk_throtl_init() and dynamically allocated group
    in throtl_find_alloc_tg(). Create a common function and use at both
    the places.

    Signed-off-by: Vivek Goyal
    Signed-off-by: Jens Axboe

    Vivek Goyal
     
    Since for-2.6.40/core was forked off the 2.6.39 devel tree, we've
    had churn in the core area that makes it difficult to handle
    patches for e.g. cfq or blk-throttle. Instead of requiring that they
    be based on older versions with bugs that have been fixed later
    in the rc cycle, merge in 2.6.39 final.

    Also fixes up conflicts in the below files.

    Conflicts:
    drivers/block/paride/pcd.c
    drivers/cdrom/viocd.c
    drivers/ide/ide-cd.c

    Signed-off-by: Jens Axboe

    Jens Axboe
     

19 May, 2011

7 commits

  • Linus Torvalds
     
  • * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2:
    configfs: Fix race between configfs_readdir() and configfs_d_iput()
    configfs: Don't try to d_delete() negative dentries.
    ocfs2/dlm: Target node death during resource migration leads to thread spin
    ocfs2: Skip mount recovery for hard-ro mounts
    ocfs2/cluster: Heartbeat mismatch message improved
    ocfs2/cluster: Increase the live threshold for global heartbeat
    ocfs2/dlm: Use negotiated o2dlm protocol version
    ocfs2: skip existing hole when removing the last extent_rec in punching-hole codes.
    ocfs2: Initialize data_ac (might be used uninitialized)

    Linus Torvalds
     
  • * 'devicetree/merge' of git://git.secretlab.ca/git/linux-2.6:
    drivercore: revert addition of of_match to struct device
    of: fix race when matching drivers

    Linus Torvalds
     
  • * 'upstream' of git://git.linux-mips.org/pub/scm/upstream-linus:
    MIPS: Kludge IP27 build for 2.6.39.
    MIPS: AR7: Fix GPIO register size for Titan variant.
    MIPS: Fix duplicate invocation of notify_die.
    MIPS: RB532: Fix iomap resource size miscalculation.

    Linus Torvalds
     
  • Commit b826291c, "drivercore/dt: add a match table pointer to struct
    device" added an of_match pointer to struct device to cache the
    of_match_table entry discovered at driver match time. This was unsafe
    because matching is not an atomic operation with probing a driver. If
    two or more drivers are matched against a device at the
    same time, then the cached matching entry pointer could get
    overwritten.

    This patch reverts the of_match cache pointer and reworks all users to
    call of_match_device() directly instead.

    Signed-off-by: Grant Likely

    Grant Likely
     
  • blk_cleanup_queue() calls elevator_exit() and after this, we can't
    touch the elevator without oopsing. __elv_next_request() must check
    for this state because in the refcounted queue model, we can still
    call it after blk_cleanup_queue() has been called.

    This was reported as causing an oops attributable to scsi.

    Signed-off-by: James Bottomley
    Cc: stable@kernel.org
    Signed-off-by: Jens Axboe

    James Bottomley
     
  • If two drivers are probing devices at the same time, both will write
    their match table result to the dev->of_match cache at the same time.

    Only write the result if the device matches.

    In a thread titled "SBus devices sometimes detected, sometimes not",
    Meelis reported his SBus hme was not detected about 50% of the time.
    From the debug suggested by Grant it was obvious another driver matched
    some devices between the call to match the hme and the hme discovery
    failing.

    Reported-by: Meelis Roos
    Signed-off-by: Milton Miller
    [grant.likely: modified to only call of_match_device() once]
    Signed-off-by: Grant Likely

    Milton Miller
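
The "only write the result if the device matches" rule above can be illustrated with a small sketch. The types and functions here (`struct match_id`, `struct device`, `match_table()`, `driver_match()`) are simplified stand-ins invented for the example, not the kernel's definitions, and the compatible strings used with them are arbitrary.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct match_id {
    const char *compatible;
};

struct device {
    const char *compatible;
    const struct match_id *of_match;   /* cached match entry */
};

/* Return the table entry matching the device, or NULL if none matches. */
static const struct match_id *match_table(const struct match_id *table,
                                          size_t n, const struct device *dev)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(table[i].compatible, dev->compatible) == 0)
            return &table[i];
    return NULL;
}

/* Look up once; update the cache only on a successful match. */
static int driver_match(struct device *dev, const struct match_id *table,
                        size_t n)
{
    const struct match_id *m;

    assert(dev != NULL && table != NULL);
    m = match_table(table, n, dev);
    if (!m)
        return 0;        /* a miss leaves dev->of_match untouched */
    dev->of_match = m;
    return 1;
}
```

With this shape, a non-matching driver probing at the same time can no longer clear or overwrite the entry cached by the driver that did match.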
     

18 May, 2011

13 commits

  • * 'for-linus' of git://git.kernel.dk/linux-2.6-block:
    block: don't delay blk_run_queue_async
    scsi: remove performance regression due to async queue run
    blk-throttle: Use task_subsys_state() to determine a task's blkio_cgroup
    block: rescan partitions on invalidated devices on -ENOMEDIA too
    cdrom: always check_disk_change() on open
    block: unexport DISK_EVENT_MEDIA_CHANGE for legacy/fringe drivers

    Linus Torvalds
     
  • Signed-off-by: Ralf Baechle

    Ralf Baechle
     
  • The 'size' variable contains the correct register size for both AR7
    and Titan, but we never used it to ioremap the correct register size.
    This problem only shows up on Titan.

    [ralf@linux-mips.org: Fixed the fix. The original patch as in patchwork
    recognizes the problem correctly then fails to fix it ...]

    Reported-by: Alexander Clouter
    Signed-off-by: Florian Fainelli
    Patchwork: https://patchwork.linux-mips.org/patch/2380/
    Signed-off-by: Ralf Baechle

    Florian Fainelli
     
    Initial patch by Yury Polyanskiy.

    Signed-off-by: Ralf Baechle
    Patchwork: https://patchwork.linux-mips.org/patch/2373/

    Ralf Baechle
     
    This is the MIPS portion of Joe Perches's
    https://patchwork.linux-mips.org/patch/2172/ which seems to have been
    lost in time and space.

    Signed-off-by: Ralf Baechle

    Ralf Baechle
     
  • configfs_readdir() will use the existing inode numbers of inodes in the
    dcache, but it makes them up for attribute files that aren't currently
    instantiated. There is a race where a closing attribute file can be
    tearing down at the same time as configfs_readdir() is trying to get its
    inode number.

    We want to get the inode number of open attribute files, because they
    should match while instantiated. We can't lock down the transition
    where dentry->d_inode is set to NULL, so we just check for NULL there.
    We can, however, ensure that an inode we find isn't iput() in
    configfs_d_iput() until after we've accessed it.

    Signed-off-by: Joel Becker

    Joel Becker
     
  • When configfs is faking mkdir() on its subsystem or default group
    objects, it starts by adding a negative dentry. It then tries to
    instantiate the group. If that should fail, it must clean up after
    itself.

    I was using d_delete() here, but configfs_attach_group() promises to
    return an empty dentry on error. d_delete() explodes with the empty
    dentry. Let's try d_drop() instead. The unhashing is what we want for
    our dentry.

    Signed-off-by: Joel Becker

    Joel Becker
     
  • Let's check a scenario:
    1. blk_delay_queue(q, SCSI_QUEUE_DELAY);
    2. blk_run_queue_async();
    the second one will become a no-op, because q->delay_work already has
    WORK_STRUCT_PENDING_BIT set, so the delayed work will still run after
    SCSI_QUEUE_DELAY. But blk_run_queue_async actually hopes the delayed
    work runs immediately.

    Fix this by doing a cancel on potentially pending delayed work
    before queuing an immediate run of the workqueue.

    Signed-off-by: Shaohua Li
    Signed-off-by: Jens Axboe

    Shaohua Li
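
The scenario and fix above can be modelled with a toy delayed-work structure. This is a simplified simulation under stated assumptions, not the workqueue API: `struct delayed_work_demo`, `queue_delayed()`, `cancel_delayed()` and `run_queue_async()` are hypothetical, and a boolean stands in for the pending bit.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a delayed work item. */
struct delayed_work_demo {
    bool pending;          /* stand-in for WORK_STRUCT_PENDING_BIT */
    unsigned long delay;   /* delay the pending work was queued with */
};

/* Queuing is a no-op while the work is already pending. */
static bool queue_delayed(struct delayed_work_demo *w, unsigned long delay)
{
    if (w->pending)
        return false;
    w->pending = true;
    w->delay = delay;
    return true;
}

/* Cancel pending work; report whether anything was pending. */
static bool cancel_delayed(struct delayed_work_demo *w)
{
    bool was_pending = w->pending;

    w->pending = false;
    return was_pending;
}

/* The fix: cancel any pending delayed run, then queue an immediate one. */
static void run_queue_async(struct delayed_work_demo *w)
{
    cancel_delayed(w);
    queue_delayed(w, 0);
    assert(w->pending && w->delay == 0);
}
```

Without the cancel step, the second `queue_delayed()` call would return false and the work would still wait out the original delay.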
     
  • * 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6:
    [media] V4L: soc-camera: regression fix: calculate .sizeimage in soc_camera.c
    [media] v4l2-subdev: fix broken subdev control enumeration
    [media] Fix cx88 remote control input
    [media] v4l: Release module if subdev registration fails

    Linus Torvalds
     
  • …git/tip/linux-2.6-tip

    * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    x86, AMD: Fix ARAT feature setting again
    Revert "x86, AMD: Fix APIC timer erratum 400 affecting K8 Rev.A-E processors"
    x86, apic: Fix spurious error interrupts triggering on all non-boot APs
    x86, mce, AMD: Fix leaving freed data in a list
    x86: Fix UV BAU for non-consecutive nasids
    x86, UV: Fix NMI handler for UV platforms

    Linus Torvalds
     
  • …/git/tip/linux-2.6-tip

    * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    perf evlist: Fix per thread mmap setup
    perf tools: Honour the cpu list parameter when also monitoring a thread list
    kprobes, x86: Disable irqs during optimized callback

    Linus Torvalds
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
    cifs: fix cifsConvertToUCS() for the mapchars case
    cifs: add fallback in is_path_accessible for old servers

    Linus Torvalds
     
  • Provide a stub for proc_mkdir_mode() when CONFIG_PROC_FS is not
    enabled, just like the stub for proc_mkdir().

    Fixes this linux-next build error:

    drivers/net/wireless/airo.c:4504: error: implicit declaration of function 'proc_mkdir_mode'

    Signed-off-by: Randy Dunlap
    Cc: Stephen Rothwell
    Cc: Alexey Dobriyan
    Cc: "John W. Linville"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy Dunlap
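
The stub pattern described above looks roughly like this. It is a generic sketch, not the actual header: `CONFIG_DEMO_FS`, `struct dir_entry`, `make_dir_mode()` and `demo_register()` are made-up names. When the feature is compiled out, an inline stub with the same signature keeps callers building and simply reports failure.

```c
#include <assert.h>
#include <stddef.h>

struct dir_entry {
    const char *name;
    unsigned int mode;
};

#ifdef CONFIG_DEMO_FS   /* hypothetical config symbol */
/* Real implementation lives elsewhere when the feature is enabled. */
struct dir_entry *make_dir_mode(const char *name, unsigned int mode);
#else
/* Stub: same signature, does nothing, returns NULL. */
static inline struct dir_entry *make_dir_mode(const char *name,
                                              unsigned int mode)
{
    (void)name;
    (void)mode;
    return NULL;
}
#endif

/* A caller compiles unchanged either way and handles the NULL case. */
static int demo_register(const char *name)
{
    assert(name != NULL);
    return make_dir_mode(name, 0555) ? 1 : 0;
}
```

Because the stub is declared in the same header as the real prototype, drivers need no `#ifdef` of their own.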