22 Aug, 2012

1 commit

  • Now that mod_delayed_work() is safe to call from IRQ handlers,
    __cancel_delayed_work() followed by queue_delayed_work() can be
    replaced with mod_delayed_work().

    Most conversions are straightforward except for the following.

    * net/core/link_watch.c: linkwatch_schedule_work() was doing quite an
    elaborate dance around its delayed_work. Collapse it such that
    linkwatch_work is queued for immediate execution if LW_URGENT and
    the existing timer is kept otherwise.

    Signed-off-by: Tejun Heo
    Cc: "David S. Miller"
    Cc: Tomi Valkeinen

    Tejun Heo
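
The conversion in the commit above can be pictured with a small userland model. This is only a sketch, not kernel code: every `toy_` name is hypothetical, and the timer is reduced to a stored expiry value. It shows why cancel followed by queue collapses into one modify call, and how the link-watch case chooses between modifying and queueing.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a delayed work item: only pending state and expiry. */
struct toy_delayed_work {
    bool pending;
    unsigned long expires;
};

/* Like queue_delayed_work(): does nothing if the item is already pending. */
static bool toy_queue(struct toy_delayed_work *w, unsigned long now,
                      unsigned long delay)
{
    if (w->pending)
        return false;
    w->pending = true;
    w->expires = now + delay;
    return true;
}

/* Cancel a pending item; returns whether it was pending. */
static bool toy_cancel(struct toy_delayed_work *w)
{
    bool was_pending = w->pending;

    w->pending = false;
    return was_pending;
}

/* Like mod_delayed_work(): (re)arm the timer whether or not it was pending. */
static bool toy_mod(struct toy_delayed_work *w, unsigned long now,
                    unsigned long delay)
{
    bool was_pending = w->pending;

    w->pending = true;
    w->expires = now + delay;
    return was_pending;
}

/* Link-watch style scheduling: urgent events run now, others keep the timer. */
static void toy_linkwatch_schedule(struct toy_delayed_work *w,
                                   unsigned long now, unsigned long delay,
                                   bool urgent)
{
    assert(w != 0);
    if (urgent)
        toy_mod(w, now, 0);
    else
        toy_queue(w, now, delay);
}
```

In this model, `toy_cancel()` followed by `toy_queue()` leaves the item in the same state as a single `toy_mod()` call with the same arguments.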
     

21 Aug, 2012

1 commit

  • system_nrt[_freezable]_wq are now spurious. Mark them deprecated and
    convert all users to system[_freezable]_wq.

    If you're cc'd and wondering what's going on: Now all workqueues are
    non-reentrant, so there's no reason to use system_nrt[_freezable]_wq.
    Please use system[_freezable]_wq instead.

    This patch doesn't make any functional difference.

    Signed-off-by: Tejun Heo
    Acked-by: Lai Jiangshan

    Cc: Jens Axboe
    Cc: David Airlie
    Cc: Jiri Kosina
    Cc: "David S. Miller"
    Cc: Rusty Russell
    Cc: "Paul E. McKenney"
    Cc: David Howells

    Tejun Heo
     

25 Jun, 2012

1 commit

  • Block layer allocates ioc very lazily. It waits until the moment
    ioc is absolutely necessary; unfortunately, that moment could be
    inside queue lock, so __get_request() performs an unlock - try
    alloc - retry dance.

    Just allocate it up-front on entry to block layer. We're not saving
    the rain forest by deferring it to the last possible moment and
    complicating things unnecessarily.

    This patch is to prepare for further updates to request allocation
    path.

    Signed-off-by: Tejun Heo
    Acked-by: Vivek Goyal
    Signed-off-by: Jens Axboe

    Tejun Heo
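
A rough userland model of the two schemes follows. It is a sketch under stated assumptions: the `toy_` types and functions are hypothetical, the lock is a plain flag with an acquisition counter, and `calloc()` stands in for an allocation that must not run with the queue lock held.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct toy_ioc {
    int unused;
};

struct toy_task {
    struct toy_ioc *ioc;
};

struct toy_queue {
    bool locked;
    int lock_acquisitions;
};

static void toy_lock(struct toy_queue *q)
{
    assert(!q->locked);
    q->locked = true;
    q->lock_acquisitions++;
}

static void toy_unlock(struct toy_queue *q)
{
    assert(q->locked);
    q->locked = false;
}

/* Allocation may sleep, so it must not run with the queue lock held. */
static bool toy_create_ioc(struct toy_queue *q, struct toy_task *t)
{
    assert(!q->locked);
    if (!t->ioc)
        t->ioc = calloc(1, sizeof(*t->ioc));
    return t->ioc != NULL;
}

/* Lazy scheme: discover the missing ioc under the lock, drop it, retry. */
static bool toy_get_request_lazy(struct toy_queue *q, struct toy_task *t)
{
    toy_lock(q);
    while (!t->ioc) {
        toy_unlock(q);
        if (!toy_create_ioc(q, t))
            return false;
        toy_lock(q);
    }
    /* ... request allocation would happen here ... */
    toy_unlock(q);
    return true;
}

/* Up-front scheme: make sure the ioc exists before taking the lock. */
static bool toy_get_request_upfront(struct toy_queue *q, struct toy_task *t)
{
    if (!toy_create_ioc(q, t))
        return false;
    toy_lock(q);
    /* ... request allocation would happen here ... */
    toy_unlock(q);
    return true;
}
```

For a task without an ioc, the lazy path takes the toy lock twice while the up-front path takes it once and has no retry loop.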
     

23 May, 2012

1 commit

  • tg_stats_alloc_lock nests inside queue lock and should always be held
    with irq disabled. throtl_pd_{init|exit}() were using non-irqsafe
    spinlock ops, which triggered an irq-related inverse lock ordering
    warning when RCU freeing of blkg invoked throtl_pd_exit() w/o
    disabling IRQ.

    Update both functions to use irq safe operations.

    Signed-off-by: Tejun Heo
    Reported-by: Sasha Levin
    LKML-Reference:
    Signed-off-by: Jens Axboe

    Tejun Heo
     

01 May, 2012

1 commit


20 Apr, 2012

10 commits

  • There's no reason to keep blkcg_policy_ops separate. Collapse it into
    blkcg_policy.

    This patch doesn't introduce any functional change.

    Signed-off-by: Tejun Heo
    Cc: Vivek Goyal
    Signed-off-by: Jens Axboe

    Tejun Heo
     
  • Currently blkg_policy_data carries policy specific data as char flex
    array instead of being embedded in policy specific data. This was
    forced by oddities around blkg allocation which are all gone now.

    This patch makes blkg_policy_data embedded in policy specific data -
    throtl_grp and cfq_group so that it's more conventional and consistent
    with how io_cq is handled.

    * blkcg_policy->pdata_size is renamed to ->pd_size.

    * Functions which used to take void *pdata now take struct
    blkg_policy_data *pd.

    * blkg_to_pdata/pdata_to_blkg() updated to blkg_to_pd/pd_to_blkg().

    * Dummy struct blkg_policy_data definition added. Dummy
    pdata_to_blkg() definition was unused and inconsistent with the
    non-dummy version - correct dummy pd_to_blkg() added.

    * throtl and cfq updated accordingly.

    * As dummy blkg_to_pd/pd_to_blkg() are provided,
    blkg_to_cfqg/cfqg_to_blkg() don't need to be ifdef'd. Moved outside
    ifdef block.

    This patch doesn't introduce any functional change.

    Signed-off-by: Tejun Heo
    Cc: Vivek Goyal
    Signed-off-by: Jens Axboe

    Tejun Heo
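
The embedding described in the commit above can be sketched in userland C. The `toy_` structures are hypothetical and only loosely modeled on blkg_policy_data and throtl_grp; the point is that a common header embedded in the policy structure lets a container_of-style helper map in both directions.

```c
#include <assert.h>
#include <stddef.h>

/* Userland equivalent of the kernel's container_of() helper. */
#define toy_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

#define TOY_MAX_POLS 2   /* arbitrary for the sketch */

struct toy_blkg;

/* Common per-policy header; policies embed it in their own structure. */
struct toy_policy_data {
    struct toy_blkg *blkg;   /* back pointer to the owning blkg */
};

struct toy_blkg {
    struct toy_policy_data *pd[TOY_MAX_POLS];
};

/* Policy-specific data with the header embedded, not a trailing char array. */
struct toy_throtl_grp {
    struct toy_policy_data pd;
    unsigned long long bps[2];
};

static struct toy_policy_data *toy_blkg_to_pd(struct toy_blkg *blkg, int plid)
{
    assert(plid >= 0 && plid < TOY_MAX_POLS);
    return blkg ? blkg->pd[plid] : NULL;
}

static struct toy_blkg *toy_pd_to_blkg(struct toy_policy_data *pd)
{
    return pd ? pd->blkg : NULL;
}

/* Recover the policy structure from its embedded header. */
static struct toy_throtl_grp *toy_pd_to_tg(struct toy_policy_data *pd)
{
    return pd ? toy_container_of(pd, struct toy_throtl_grp, pd) : NULL;
}
```

Because the header is a typed member rather than an opaque byte array, the helpers stay type-checked and need no void pointer casts at call sites.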
     
  • During the recent blkcg cleanup, most of blkcg API has changed to such
    an extent that mass renaming wouldn't cause any noticeable pain. Take
    the chance and clean up the naming.

    * Rename blkio_cgroup to blkcg.

    * Drop blkio / blkiocg prefixes and consistently use blkcg.

    * Rename blkio_group to blkcg_gq, which is consistent with io_cq but
    keep the blkg prefix / variable name.

    * Rename policy method type and field names to signify they're dealing
    with policy data.

    * Rename blkio_policy_type to blkcg_policy.

    This patch doesn't cause any functional change.

    Signed-off-by: Tejun Heo
    Cc: Vivek Goyal
    Signed-off-by: Jens Axboe

    Tejun Heo
     
  • blkio_group->path[] stores the path of the associated cgroup and is
    used only for debug messages. Just format the path from blkg->cgroup
    when printing debug messages.

    Signed-off-by: Tejun Heo
    Cc: Vivek Goyal
    Signed-off-by: Jens Axboe

    Tejun Heo
     
  • * all_q_list is unused. Drop all_q_{mutex|list}.

    * @for_root of blkg_lookup_create() is always %false when called from
    outside blk-cgroup.c proper. Factor out __blkg_lookup_create() so
    that it doesn't check whether @q is bypassing and use the
    underscored version for the @for_root callsite.

    * blkg_destroy_all() is used only from blkcg proper and @destroy_root
    is always %true. Make it static and drop @destroy_root.

    Signed-off-by: Tejun Heo
    Cc: Vivek Goyal
    Signed-off-by: Jens Axboe

    Tejun Heo
     
  • All blkcg policies were assumed to be enabled on all request_queues.
    Due to various implementation obstacles, during the recent blkcg core
    updates, this was temporarily implemented as shooting down all !root
    blkgs on elevator switch and policy [de]registration combined with
    half-broken in-place root blkg updates. In addition to being buggy
    and racy, this meant losing all blkcg configurations across those
    events.

    Now that blkcg is cleaned up enough, this patch replaces the temporary
    implementation with proper per-queue policy activation. Each blkcg
    policy should call the new blkcg_[de]activate_policy() to enable and
    disable the policy on a specific queue. blkcg_activate_policy()
    allocates and installs policy data for the policy for all existing
    blkgs. blkcg_deactivate_policy() does the reverse. If a policy is
    not enabled for a given queue, blkg printing / config functions skip
    the respective blkg for the queue.

    blkcg_activate_policy() also takes care of root blkg creation, and
    cfq_init_queue() and blk_throtl_init() are updated accordingly.

    This makes blkcg_bypass_{start|end}() and update_root_blkg_pd()
    unnecessary. Dropped.

    v2: cfq_init_queue() was returning uninitialized @ret on root_group
    alloc failure if !CONFIG_CFQ_GROUP_IOSCHED. Fixed.

    Signed-off-by: Tejun Heo
    Cc: Vivek Goyal
    Signed-off-by: Jens Axboe

    Tejun Heo
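
Per-queue activation as described in the commit above might look roughly like this in a userland model. All `toy_` names, the fixed-size arrays and the bitmask representation are assumptions of the sketch, not the kernel's actual layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define TOY_MAX_POLS  2
#define TOY_MAX_BLKGS 4

struct toy_blkg {
    void *pd[TOY_MAX_POLS];   /* per-policy data, NULL when inactive */
};

struct toy_queue {
    unsigned int active_pols;   /* bit N set: policy N is active here */
    struct toy_blkg blkgs[TOY_MAX_BLKGS];
    int nr_blkgs;
};

static bool toy_policy_enabled(const struct toy_queue *q, int plid)
{
    return (q->active_pols & (1u << plid)) != 0;
}

/* Allocate policy data for every existing blkg, then mark the policy active. */
static int toy_activate_policy(struct toy_queue *q, int plid, size_t pd_size)
{
    int i;

    assert(plid >= 0 && plid < TOY_MAX_POLS);
    if (toy_policy_enabled(q, plid))
        return 0;
    for (i = 0; i < q->nr_blkgs; i++) {
        q->blkgs[i].pd[plid] = calloc(1, pd_size);
        if (!q->blkgs[i].pd[plid])
            goto undo;
    }
    q->active_pols |= 1u << plid;
    return 0;
undo:
    /* Roll back the allocations made before the failure. */
    while (i-- > 0) {
        free(q->blkgs[i].pd[plid]);
        q->blkgs[i].pd[plid] = NULL;
    }
    return -1;
}

/* Reverse of activation: clear the bit and release all policy data. */
static void toy_deactivate_policy(struct toy_queue *q, int plid)
{
    int i;

    q->active_pols &= ~(1u << plid);
    for (i = 0; i < q->nr_blkgs; i++) {
        free(q->blkgs[i].pd[plid]);
        q->blkgs[i].pd[plid] = NULL;
    }
}

/* Number of blkgs a stat printer would visit; inactive policies print none. */
static int toy_count_printable(const struct toy_queue *q, int plid)
{
    return toy_policy_enabled(q, plid) ? q->nr_blkgs : 0;
}
```

Activating one policy leaves the other policy's data untouched on the same queue, which is what lets configuration survive unrelated activation events in this model.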
     
  • With per-queue policy activation, root blkg creation will be moved to
    blkcg core. Add q->root_blkg in preparation. For blk-throtl, this
    replaces throtl_data->root_tg; however, cfq needs to keep
    cfqd->root_group for !CONFIG_CFQ_GROUP_IOSCHED.

    This is to prepare for per-queue policy activation and doesn't cause
    any functional difference.

    Signed-off-by: Tejun Heo
    Cc: Vivek Goyal
    Signed-off-by: Jens Axboe

    Tejun Heo
     
  • Add @pol to blkg_conf_prep() and let it return with queue lock held
    (to be released by blkg_conf_finish()). Note that @pol isn't used
    yet.

    This is to prepare for per-queue policy activation and doesn't cause
    any visible difference.

    Signed-off-by: Tejun Heo
    Cc: Vivek Goyal
    Signed-off-by: Jens Axboe

    Tejun Heo
     
  • Remove BLKIO_POLICY_* enums and let blkio_policy_register() allocate
    @pol->plid dynamically on registration. The maximum number of blkcg
    policies which can be registered at the same time is defined by
    BLKCG_MAX_POLS constant added to include/linux/blkdev.h.

    Note that blkio_policy_register() now may fail. Policy init functions
    updated accordingly and unnecessary ifdefs removed from cfq_init().

    Signed-off-by: Tejun Heo
    Cc: Vivek Goyal
    Signed-off-by: Jens Axboe

    Tejun Heo
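
Dynamic ID allocation at registration time can be sketched as a slot table. The names below are hypothetical, `TOY_MAX_POLS` merely stands in for BLKCG_MAX_POLS, and returning `-ENOSPC` when the table is full is a choice made for this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_MAX_POLS 2   /* stands in for BLKCG_MAX_POLS; value is arbitrary */

struct toy_policy {
    int plid;            /* assigned at registration time */
    const char *name;
};

static struct toy_policy *toy_policies[TOY_MAX_POLS];

/* Claim the first free slot; fails with -ENOSPC when every slot is taken. */
static int toy_policy_register(struct toy_policy *pol)
{
    int i;

    for (i = 0; i < TOY_MAX_POLS; i++) {
        if (!toy_policies[i]) {
            toy_policies[i] = pol;
            pol->plid = i;
            return 0;
        }
    }
    return -ENOSPC;
}

/* Release the slot so a later registration can reuse the ID. */
static void toy_policy_unregister(struct toy_policy *pol)
{
    assert(pol->plid >= 0 && pol->plid < TOY_MAX_POLS);
    assert(toy_policies[pol->plid] == pol);
    toy_policies[pol->plid] = NULL;
}
```

Since registration can now fail, callers in this model have to check the return value instead of assuming a fixed enum ID.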
     
  • The two functions were taking "enum blkio_policy_id plid". Make them
    take "const struct blkio_policy_type *pol" instead.

    This is to prepare for per-queue policy activation and doesn't cause
    any functional difference.

    Signed-off-by: Tejun Heo
    Cc: Vivek Goyal
    Signed-off-by: Jens Axboe

    Tejun Heo
     

02 Apr, 2012

8 commits

  • Now that all stat handling code lives in policy implementations,
    there's no need to encode policy ID in cft->private.

    * Export blkcg_prfill_[rw]stat() from blkcg, remove
    blkcg_print_[rw]stat(), and implement cfqg_print_[rw]stat() which
    use hard-coded BLKIO_POLICY_PROP.

    * Use cft->private for offset of the target field directly and drop
    BLKCG_STAT_{PRIV|POL|OFF}().

    Signed-off-by: Tejun Heo

    Tejun Heo
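
Storing a field offset in a per-file private value, as the commit above does with cft->private, can be illustrated in plain C with offsetof(). The structures here are hypothetical stand-ins, not the kernel's cftype or stat types.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-group stats, loosely modeled on a policy's counters. */
struct toy_group_stats {
    uint64_t service_bytes;
    uint64_t serviced;
    uint64_t sectors;
};

/* Stand-in for a control file: 'private' carries only a field offset. */
struct toy_cftype {
    const char *name;
    size_t private;
};

/* Generic reader: locate the target counter from the stored offset. */
static uint64_t toy_read_stat(const struct toy_group_stats *stats,
                              const struct toy_cftype *cft)
{
    assert(cft->private + sizeof(uint64_t) <= sizeof(*stats));
    return *(const uint64_t *)((const char *)stats + cft->private);
}

/* One table entry per file; no policy ID needs to be encoded. */
static const struct toy_cftype toy_files[] = {
    { "service_bytes", offsetof(struct toy_group_stats, service_bytes) },
    { "serviced",      offsetof(struct toy_group_stats, serviced) },
    { "sectors",       offsetof(struct toy_group_stats, sectors) },
};
```

A single reader function then serves every file in the table, with the offset selecting the counter.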
     
  • Now that all conf and stat fields are moved into policy specific
    blkio_policy_data->pdata areas, there's no reason to use
    blkio_policy_data itself in prfill functions. Pass around @pd->pdata
    instead of @pd.

    Signed-off-by: Tejun Heo

    Tejun Heo
     
  • blkio_group_conf->iops and ->bps are owned by blk-throttle and have no
    reason to be defined in blkcg core. Drop them and let conf setting
    functions directly manipulate throtl_grp->bps[] and ->iops[].

    This makes blkio_group_conf empty. Drop it.

    Signed-off-by: Tejun Heo

    Tejun Heo
     
  • blkio_group_stats_cpu is used only by blk-throtl and has no reason to
    be defined in blkcg core.

    * Move blkio_group_stats_cpu to blk-throttle.c and rename it to
    tg_stats_cpu.

    * blkg_policy_data->stats_cpu is replaced with throtl_grp->stats_cpu.
    prfill functions updated accordingly.

    * All related macros / functions are renamed so that they have tg_
    prefix and the unnecessary @pol arguments are dropped.

    * Per-cpu stats allocation code is also moved from blk-cgroup.c to
    blk-throttle.c and gets simplified to only deal with
    BLKIO_POLICY_THROTL. percpu stat free is performed by the exit
    method throtl_exit_blkio_group().

    * throtl_reset_group_stats() implemented for
    blkio_reset_group_stats_fn method so that tg->stats_cpu can be
    reset.

    Signed-off-by: Tejun Heo

    Tejun Heo
     
  • blkio_group_stats_cpu is used to count dispatch stats using per-cpu
    counters. This is used by both blk-throtl and cfq-iosched but the
    sharing is rather silly.

    * cfq-iosched doesn't need per-cpu dispatch stats. cfq always updates
    those stats while holding queue_lock.

    * blk-throtl needs per-cpu dispatch stats but only service_bytes and
    serviced. It doesn't make use of sectors.

    This patch makes cfq add and use global stats for service_bytes,
    serviced and sectors, removes per-cpu sectors counter and moves
    per-cpu stat printing code to blk-throttle.c.

    Signed-off-by: Tejun Heo

    Tejun Heo
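
The split described in the commit above, per-cpu counters for blk-throttle and plain counters for cfq, can be modeled as follows. This is a userland sketch with hypothetical `toy_` names; the CPU count is arbitrary, and the plain counters are assumed to be updated only while the caller holds the queue lock.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_NR_CPUS 4   /* arbitrary CPU count for the sketch */

/* Throttle side: one counter set per CPU, no sectors field. */
struct toy_tg_stats_cpu {
    uint64_t service_bytes;
    uint64_t serviced;
};

struct toy_tg {
    struct toy_tg_stats_cpu stats_cpu[TOY_NR_CPUS];
};

/* cfq side: plain counters, assumed to be protected by the queue lock. */
struct toy_cfqg_stats {
    uint64_t service_bytes;
    uint64_t serviced;
    uint64_t sectors;
};

/* Each CPU touches only its own slot, so no shared lock is needed. */
static void toy_tg_update_dispatch(struct toy_tg *tg, int cpu, uint64_t bytes)
{
    assert(cpu >= 0 && cpu < TOY_NR_CPUS);
    tg->stats_cpu[cpu].service_bytes += bytes;
    tg->stats_cpu[cpu].serviced++;
}

/* Readers pay the cost instead: they sum over all CPUs. */
static uint64_t toy_tg_read_service_bytes(const struct toy_tg *tg)
{
    uint64_t sum = 0;
    int cpu;

    for (cpu = 0; cpu < TOY_NR_CPUS; cpu++)
        sum += tg->stats_cpu[cpu].service_bytes;
    return sum;
}

/* Caller is assumed to hold the queue lock. */
static void toy_cfqg_update_dispatch(struct toy_cfqg_stats *s, uint64_t bytes)
{
    s->service_bytes += bytes;
    s->serviced++;
    s->sectors += bytes >> 9;   /* 512-byte sectors */
}
```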
     
  • As with conf/stats file handling code, there's no reason for stat
    update code to live in blkcg core with policies calling in to update
    them. The current organization is both inflexible and complex.

    This patch moves stat update code to specific policies. All
    blkiocg_update_*_stats() functions which deal with BLKIO_POLICY_PROP
    stats are collapsed into their cfq_blkiocg_update_*_stats()
    counterparts. blkiocg_update_dispatch_stats() is used by both
    policies and duplicated as throtl_update_dispatch_stats() and
    cfq_blkiocg_update_dispatch_stats(). This will be cleaned up later.

    Signed-off-by: Tejun Heo

    Tejun Heo
     
  • blkcg conf/stat handling is convoluted in that details which belong to
    specific policy implementations are all out in blkcg core and then
    policies hook into core layer to access and manipulate confs and
    stats. This sadly achieves both inflexibility (confs/stats can't be
    modified without messing with blkcg core) and complexity (all the
    call-ins and call-backs).

    The previous patches restructured conf and stat handling code such
    that they can be separated out. This patch relocates the file
    handling part. All conf/stat file handling code which belongs to
    BLKIO_POLICY_PROP is moved to cfq-iosched.c and all
    BLKIO_POLICY_THROTL code to blk-throttle.c.

    The move is verbatim except for blkio_update_group_{weight|bps|iops}()
    callbacks which relay conf changes to policies. The configuration
    settings are handled in policies themselves so the relaying isn't
    necessary. Conf setting functions are modified to directly call
    per-policy update functions and the relaying mechanism is dropped.

    Signed-off-by: Tejun Heo

    Tejun Heo
     
  • @pol to blkg_to_pdata() and @plid to blkg_lookup_create() are no
    longer necessary. Drop them.

    Signed-off-by: Tejun Heo

    Tejun Heo
     

30 Mar, 2012

1 commit


07 Mar, 2012

16 commits

  • Make blk-throttle call bio_associate_current() on bios being delayed
    such that they get issued to block layer with the original io_context.
    This allows stacking blk-throttle and cfq-iosched propio policies.
    bios will always be issued with the correct ioc and blkcg whether they
    get delayed by blk-throttle or not.

    Signed-off-by: Tejun Heo
    Cc: Vivek Goyal
    Signed-off-by: Jens Axboe

    Tejun Heo
     
  • Implement bio_blkio_cgroup() which returns the blkcg associated with
    the bio if it exists or %current's blkcg, and use it in blk-throttle and
    cfq-iosched propio. This makes both cgroup policies honor task
    association for the bio instead of always assuming %current.

    As nobody is using bio_set_task() yet, this doesn't introduce any
    behavior change.

    Signed-off-by: Tejun Heo
    Cc: Vivek Goyal
    Signed-off-by: Jens Axboe

    Tejun Heo
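
The lookup rule from the commit above, together with the association step from the preceding entry, reduces to a small fallback. The sketch below is a userland model with hypothetical `toy_` names; in particular, silently keeping an existing association in `toy_bio_associate_current()` is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>

struct toy_blkcg {
    const char *name;
};

struct toy_task {
    struct toy_blkcg *blkcg;
};

struct toy_bio {
    struct toy_blkcg *blkcg;   /* NULL unless explicitly associated */
};

/* Prefer the blkcg recorded on the bio; otherwise use the current task's. */
static struct toy_blkcg *toy_bio_blkcg(const struct toy_bio *bio,
                                       const struct toy_task *cur)
{
    assert(cur != NULL && cur->blkcg != NULL);
    if (bio != NULL && bio->blkcg != NULL)
        return bio->blkcg;
    return cur->blkcg;
}

/* Record the submitter on the bio before it is handed to another context. */
static void toy_bio_associate_current(struct toy_bio *bio,
                                      const struct toy_task *cur)
{
    if (bio->blkcg == NULL)
        bio->blkcg = cur->blkcg;
}
```

Once a bio has been associated, a later lookup from a different task (for example a worker that issues delayed bios) still resolves to the submitter's blkcg.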
     
  • Now that blkg additions / removals are always done under both q and
    blkcg locks, the only places RCU locking is necessary are
    blkg_lookup[_create]() for lookup w/o blkcg lock. This patch drops
    unnecessary RCU locking, replacing it with plain blkcg locking as
    necessary.

    * blkiocg_pre_destroy() already performs proper locking and doesn't
    need RCU. Dropped.

    * blkio_read_blkg_stats() now uses blkcg->lock instead of RCU read
    lock. This isn't a hot path.

    * Now unnecessary synchronize_rcu() from queue exit paths removed.
    This makes q->nr_blkgs unnecessary. Dropped.

    * RCU annotation on blkg->q removed.

    -v2: Vivek pointed out that blkg_lookup_create() still needs to be
    called under rcu_read_lock(). Updated.

    -v3: After the update, stats_lock locking in blkio_read_blkg_stats()
    shouldn't be using _irq variant as it otherwise ends up enabling
    irq while blkcg->lock is locked. Fixed.

    Signed-off-by: Tejun Heo
    Cc: Vivek Goyal
    Signed-off-by: Jens Axboe

    Tejun Heo
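
The -v3 note above concerns nesting an _irq lock variant inside a lock that already has interrupts disabled. A toy userland model makes the hazard visible. Nothing here is kernel API: the interrupt flag is a global bool, and the three `toy_` lock flavors only mimic the plain, _irq and _irqsave conventions.

```c
#include <assert.h>
#include <stdbool.h>

/* Models the CPU interrupt-enable flag; true means interrupts are on. */
static bool toy_irqs_enabled = true;

struct toy_lock {
    bool held;
};

enum toy_variant { TOY_PLAIN, TOY_IRQ, TOY_IRQSAVE };

/* Plain variant: leaves the interrupt flag alone. */
static void toy_lock_plain(struct toy_lock *l)
{
    assert(!l->held);
    l->held = true;
}

static void toy_unlock_plain(struct toy_lock *l)
{
    assert(l->held);
    l->held = false;
}

/* _irq variant: disables on lock, unconditionally enables on unlock. */
static void toy_lock_irq(struct toy_lock *l)
{
    toy_irqs_enabled = false;
    toy_lock_plain(l);
}

static void toy_unlock_irq(struct toy_lock *l)
{
    toy_unlock_plain(l);
    toy_irqs_enabled = true;
}

/* _irqsave variant: remembers the previous flag and restores it later. */
static bool toy_lock_irqsave(struct toy_lock *l)
{
    bool flags = toy_irqs_enabled;

    toy_irqs_enabled = false;
    toy_lock_plain(l);
    return flags;
}

static void toy_unlock_irqrestore(struct toy_lock *l, bool flags)
{
    toy_unlock_plain(l);
    toy_irqs_enabled = flags;
}

/* Take an inner lock while an outer _irq lock is held; report whether
 * interrupts were still off after the inner lock was released. */
static bool toy_irqs_stay_off(enum toy_variant inner_variant)
{
    struct toy_lock outer = { false }, inner = { false };
    bool flags, still_off;

    toy_lock_irq(&outer);
    switch (inner_variant) {
    case TOY_IRQ:
        toy_lock_irq(&inner);
        toy_unlock_irq(&inner);   /* turns interrupts back on too early */
        break;
    case TOY_IRQSAVE:
        flags = toy_lock_irqsave(&inner);
        toy_unlock_irqrestore(&inner, flags);
        break;
    default:
        toy_lock_plain(&inner);
        toy_unlock_plain(&inner);
        break;
    }
    still_off = !toy_irqs_enabled;
    toy_unlock_irq(&outer);
    return still_off;
}
```

Only the `TOY_IRQ` inner variant leaves interrupts enabled while the outer lock is still held; the plain and save/restore variants preserve the outer lock's state.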
     
  • Currently, blkg is per cgroup-queue-policy combination. This is
    unnatural and leads to various convolutions in partially used
    duplicate fields in blkg, config / stat access, and general management
    of blkgs.

    This patch makes blkg's per cgroup-queue and lets them serve all
    policies. blkgs are now created and destroyed by blkcg core proper.
    This will allow further consolidation of common management logic into
    blkcg core and API with better defined semantics and layering.

    As a transitional step to untangle blkg management, elvswitch and
    policy [de]registration, all blkgs except the root blkg are being shot
    down during elvswitch and bypass. This patch adds blkg_root_update()
    to update root blkg in place on policy change. This is hacky and racy
    but should be good enough as interim step until we get locking
    simplified and switch over to proper in-place update for all blkgs.

    -v2: Root blkgs need to be updated on elvswitch too and blkg_alloc()
    comment wasn't updated according to the function change. Fixed.
    Both pointed out by Vivek.

    -v3: v2 updated blkg_destroy_all() to invoke update_root_blkg_pd() for
    all policies. This freed root pd during elvswitch before the
    last queue finished exiting and led to oops. Directly invoke
    update_root_blkg_pd() only on BLKIO_POLICY_PROP from
    cfq_exit_queue(). This also is closer to what will be done with
    proper in-place blkg update. Reported by Vivek.

    Signed-off-by: Tejun Heo
    Cc: Vivek Goyal
    Signed-off-by: Jens Axboe

    Tejun Heo
     
  • With the previous patch to move blkg list heads and counters to
    request_queue and blkg, logic to manage them in both policies is
    almost identical and can be moved to blkcg core.

    This patch moves blkg link logic into blkg_lookup_create(), implements
    common blkg unlink code in blkg_destroy(), and updates
    blkg_destroy_all() so that it's policy specific and can skip root
    group. The updated blkg_destroy_all() is now used both to clear queue
    for bypassing and elv switching, and to release all blkgs on q exit.

    This patch introduces a race window where policy [de]registration may
    race against queue blkg clearing. This can only be a problem on cfq
    unload and shouldn't be a real problem in practice (and we have many
    other places where this race already exists). Future patches will
    remove these unlikely races.

    Signed-off-by: Tejun Heo
    Cc: Vivek Goyal
    Signed-off-by: Jens Axboe

    Tejun Heo
     
  • Currently, specific policy implementations are responsible for
    maintaining list and number of blkgs. This duplicates code
    unnecessarily, and hinders factoring common code and providing blkcg
    API with better defined semantics.

    After this patch, request_queue hosts list heads and counters and blkg
    has list nodes for both policies. This patch only relocates the
    necessary fields and the next patch will actually move management code
    into blkcg core.

    Note that request_queue->blkg_list[] and ->nr_blkgs[] are hardcoded to
    have 2 elements. This is to avoid include dependency and will be
    removed by the next patch.

    This patch doesn't introduce any behavior change.

    -v2: Now unnecessary conditional on CONFIG_BLK_CGROUP_MODULE removed
    as pointed out by Vivek.

    Signed-off-by: Tejun Heo
    Cc: Vivek Goyal
    Signed-off-by: Jens Axboe

    Tejun Heo
     
  • blkg is scheduled to be unified for all policies and thus there won't
    be one-to-one mapping from blkg to policy. Update stat related
    functions to take explicit @pol or @plid arguments and not use
    blkg->plid.

    This is painful for now but most of specific stat interface functions
    will be replaced with a handful of generic helpers.

    Signed-off-by: Tejun Heo
    Cc: Vivek Goyal
    Signed-off-by: Jens Axboe

    Tejun Heo
     
  • Currently, blkcg policy implementations manage blkg refcnt, duplicating
    mostly identical code in both policies. This patch moves refcnt to
    blkg and lets blkcg core handle refcnt and freeing of blkgs.

    * cfq blkgs now also get freed via RCU.

    * cfq blkgs lose RB_EMPTY_ROOT() sanity check on blkg free. If
    necessary, we can add blkio_exit_group_fn() to resurrect this.

    Signed-off-by: Tejun Heo
    Cc: Vivek Goyal
    Signed-off-by: Jens Axboe

    Tejun Heo
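
Moving the reference count into the shared object, as the commit above does, can be sketched with a minimal get/put pair. The `toy_` names are hypothetical, the counter is not atomic, and the object is freed immediately rather than via RCU, so this illustrates ownership only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct toy_blkg {
    int refcnt;   /* owned by the core, not by individual policies */
};

/* A new object starts with one reference held by its creator. */
static struct toy_blkg *toy_blkg_alloc(void)
{
    struct toy_blkg *blkg = calloc(1, sizeof(*blkg));

    if (blkg)
        blkg->refcnt = 1;
    return blkg;
}

static void toy_blkg_get(struct toy_blkg *blkg)
{
    assert(blkg->refcnt > 0);
    blkg->refcnt++;
}

/* Returns true when this call dropped the last reference and freed blkg. */
static bool toy_blkg_put(struct toy_blkg *blkg)
{
    assert(blkg->refcnt > 0);
    if (--blkg->refcnt > 0)
        return false;
    free(blkg);
    return true;
}
```

With the counter in one place, every policy calls the same get/put pair instead of carrying its own copy of the release logic.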
     
  • Currently, blkg's are embedded in blkcg policy private data
    structures and thus allocated and freed by policies. This leads
    to duplicate code in policies, hinders implementing common parts in
    blkcg core with strong semantics, and forces duplicate blkg's for the
    same cgroup-q association.

    This patch introduces struct blkg_policy_data which is a separate data
    structure chained from blkg. Each policy specifies the amount of
    private data it needs in its blkio_policy_type->pdata_size and blkcg
    core takes care of allocating it along with blkg; the private data
    can be accessed using blkg_to_pdata(). blkg can be determined from
    pdata using pdata_to_blkg(). blkio_alloc_group_fn() method is
    accordingly updated to blkio_init_group_fn().

    For consistency, tg_of_blkg() and cfqg_of_blkg() are replaced with
    blkg_to_tg() and blkg_to_cfqg() respectively, and functions to map in
    the reverse direction are added.

    Except that policy specific data now lives in a separate data
    structure from blkg, this patch doesn't introduce any functional
    difference.

    This will be used to unify blkg's for different policies.

    Signed-off-by: Tejun Heo
    Cc: Vivek Goyal
    Signed-off-by: Jens Axboe

    Tejun Heo
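
A chained policy-data structure with a trailing private area, as introduced by the commit above, can be sketched in userland C. The `toy_` names are hypothetical, a single policy is assumed, and the flexible array is typed `unsigned long long` purely to get suitable alignment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct toy_blkg;

/* Separate allocation chained from the blkg; private data trails the header. */
struct toy_policy_data {
    struct toy_blkg *blkg;        /* back pointer to the owning blkg */
    unsigned long long pdata[];   /* policy-private area, sized at alloc time */
};

struct toy_blkg {
    struct toy_policy_data *pd;
};

/* The policy only declares how much private space it needs. */
struct toy_policy_type {
    size_t pdata_size;
};

/* Example private data for one policy. */
struct toy_throtl_grp {
    unsigned long long bps[2];
};

/* The core allocates the blkg and the policy data together. */
static struct toy_blkg *toy_blkg_alloc(const struct toy_policy_type *pol)
{
    struct toy_blkg *blkg = calloc(1, sizeof(*blkg));

    if (!blkg)
        return NULL;
    blkg->pd = calloc(1, sizeof(*blkg->pd) + pol->pdata_size);
    if (!blkg->pd) {
        free(blkg);
        return NULL;
    }
    blkg->pd->blkg = blkg;
    return blkg;
}

static void toy_blkg_free(struct toy_blkg *blkg)
{
    free(blkg->pd);
    free(blkg);
}

static void *toy_blkg_to_pdata(struct toy_blkg *blkg)
{
    return blkg->pd->pdata;
}

/* Step back from the private area to its header, then follow the pointer. */
static struct toy_blkg *toy_pdata_to_blkg(void *pdata)
{
    struct toy_policy_data *pd = (struct toy_policy_data *)
        ((char *)pdata - offsetof(struct toy_policy_data, pdata));

    assert(pd->blkg != NULL);
    return pd->blkg;
}
```

The policy never allocates or frees anything itself in this model; it receives a pointer to its private area and can always find its way back to the blkg.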
     
  • Currently block core calls directly into blk-throttle for init, drain
    and exit. This patch adds blkcg_{init|drain|exit}_queue() which wraps
    the blk-throttle functions. This is to give more control and
    visibility to blkcg core layer for proper layering. Further patches
    will add logic common to blkcg policies to the functions.

    While at it, collapse blk_throtl_release() into blk_throtl_exit().
    There's no reason to keep them separate.

    Signed-off-by: Tejun Heo
    Cc: Vivek Goyal
    Signed-off-by: Jens Axboe

    Tejun Heo
     
  • Currently, blkg points to the associated blkcg via its css_id. This
    unnecessarily complicates dereferencing blkcg. Let blkg hold a
    reference to the associated blkcg and point directly to it and disable
    css_id on blkio_subsys.

    This change requires splitting blkiocg_destroy() into
    blkiocg_pre_destroy() and blkiocg_destroy() so that all blkg's can be
    destroyed and all the blkcg references held by them dropped during
    cgroup removal.

    Signed-off-by: Tejun Heo
    Cc: Vivek Goyal
    Signed-off-by: Jens Axboe

    Tejun Heo
     
  • blkg->dev is dev_t recording the device number of the block device for
    the associated request_queue. It is used to identify the associated
    block device when printing out configuration or stats.

    This is redundant to begin with. A blkg is an association between a
    cgroup and a request_queue and it of course is possible to reach
    request_queue from blkg and synchronization conventions are in place
    for safe q dereferencing, so this shouldn't be necessary from the
    beginning. Furthermore, it's initialized by sscanf()ing the device
    name of backing_dev_info. The mind boggles.

    Anyways, if blkg is visible under rcu lock, we *know* that the
    associated request_queue hasn't gone away yet and its bdi is
    registered and alive - blkg can't be created for request_queue which
    hasn't been fully initialized and it can't go away before blkg is
    removed.

    Let stat and conf read functions get device name from
    blkg->q->backing_dev_info.dev and pass it down to printing functions
    and remove blkg->dev.

    Signed-off-by: Tejun Heo
    Cc: Vivek Goyal
    Signed-off-by: Jens Axboe

    Tejun Heo
     
  • blkcg is very peculiar in that it allows setting and remembering
    configurations for non-existent devices by maintaining separate data
    structures for configuration.

    This behavior is completely out of the usual norms and outright
    confusing; furthermore, it uses dev_t number to match the
    configuration to devices, which is unpredictable to begin with and
    becomes completely unusable if EXT_DEVT is fully used.

    It is wholly unnecessary - we already have fully functional userland
    mechanism to program devices being hotplugged which has full access to
    device identification, connection topology and filesystem information.

    Add a new struct blkio_group_conf which contains all blkcg
    configurations to blkio_group and let blkio_group, which can be
    created iff the associated device exists and is removed when the
    associated device goes away, carry all configurations.

    Note that, after this patch, all newly created blkg's will always have
    the default configuration (unlimited for throttling and blkcg's weight
    for propio).

    This patch makes blkio_policy_node meaningless but doesn't remove it.
    The next patch will.

    -v2: Updated to retry after short sleep if blkg lookup/creation failed
    due to the queue being temporarily bypassed as indicated by
    -EBUSY return. Pointed out by Vivek.

    Signed-off-by: Tejun Heo
    Cc: Vivek Goyal
    Cc: Kay Sievers
    Signed-off-by: Jens Axboe

    Tejun Heo
     
  • Currently both blk-throttle and cfq-iosched implement their own
    blkio_group creation code in throtl_get_tg() and cfq_get_cfqg(). This
    patch factors out the common code into blkg_lookup_create(), which
    returns ERR_PTR value so that transitional failures due to queue
    bypass can be distinguished from other failures.

    * New blkio_policy_ops methods blkio_alloc_group_fn() and
    blkio_link_group_fn() added. Both are transitional and will be
    removed once the blkg management code is fully moved into
    blk-cgroup.c.

    * blkio_alloc_group_fn() allocates policy-specific blkg which is
    usually a larger data structure with blkg as the first entry and
    initializes it. Note that initialization of blkg proper, including
    percpu stats, is the responsibility of blk-cgroup proper.

    Note that default config (weight, bps...) initialization is done
    from this method; otherwise, we end up violating locking order
    between blkcg and q locks via blkcg_get_CONF() functions.

    * blkio_link_group_fn() is called under queue_lock and responsible for
    linking the blkg to the queue. blkcg side is handled by blk-cgroup
    proper.

    * The common blkg creation function is named blkg_lookup_create() and
    blkiocg_lookup_group() is renamed to blkg_lookup() for consistency.
    Also, throtl / cfq related functions are similarly [re]named for
    consistency.

    This simplifies blkcg policy implementations and enables further
    cleanup.

    -v2: Vivek noticed that blkg_lookup_create() incorrectly tested
    blk_queue_dead() instead of blk_queue_bypass(), leading a user of
    the function to end up creating a new blkg on a bypassing queue.
    This is a bug introduced while relocating bypass patches before
    this one. Fixed.

    -v3: ERR_PTR patch folded into this one. @for_root added to
    blkg_lookup_create() to allow creating root group on a bypassed
    queue during elevator switch.

    Signed-off-by: Tejun Heo
    Cc: Vivek Goyal
    Signed-off-by: Jens Axboe

    Tejun Heo
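
Returning an error-encoding pointer lets a caller tell a transient bypass failure apart from a hard one. The sketch below re-implements the ERR_PTR idiom in userland; the `toy_` names, the bypass counter and the bounded retry loop are all assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_MAX_ERRNO 4095

/* Userland versions of the ERR_PTR()/PTR_ERR()/IS_ERR() idiom. */
static void *toy_err_ptr(long err)
{
    return (void *)(intptr_t)err;
}

static long toy_ptr_err(const void *p)
{
    return (long)(intptr_t)p;
}

static bool toy_is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-TOY_MAX_ERRNO;
}

struct toy_blkg {
    int id;
};

struct toy_queue {
    int bypass_rounds;   /* lookups that still see the queue bypassed */
    struct toy_blkg blkg;
};

/* Fails with -EBUSY while bypassing, which callers may treat as transient. */
static struct toy_blkg *toy_blkg_lookup_create(struct toy_queue *q)
{
    if (q->bypass_rounds > 0) {
        q->bypass_rounds--;
        return toy_err_ptr(-EBUSY);
    }
    return &q->blkg;
}

/* Retry only the transient failure; a real caller would sleep between tries. */
static struct toy_blkg *toy_conf_lookup(struct toy_queue *q, int max_tries)
{
    struct toy_blkg *blkg = toy_err_ptr(-EINVAL);

    assert(max_tries > 0);
    while (max_tries-- > 0) {
        blkg = toy_blkg_lookup_create(q);
        if (!toy_is_err(blkg) || toy_ptr_err(blkg) != -EBUSY)
            break;
    }
    return blkg;
}
```

A plain NULL return would force callers to guess why the lookup failed; with the encoded error they can retry on `-EBUSY` and give up on anything else.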
     
  • For root blkg, blk_throtl_init() was using throtl_alloc_tg()
    explicitly and cfq_init_queue() was manually initializing embedded
    cfqd->root_group, adding unnecessarily different code paths to blkg
    handling.

    Make both use the usual blkio_group get functions - throtl_get_tg()
    and cfq_get_cfqg() - for the root blkio_group too. Note that
    blk_throtl_init() callsite is pushed downwards in
    blk_alloc_queue_node() so that @q is sufficiently initialized for
    throtl_get_tg().

    This simplifies root blkg handling noticeably for cfq and will allow
    further modularization of blkcg API.

    -v2: Vivek pointed out that using cfq_get_cfqg() won't work if
    CONFIG_CFQ_GROUP_IOSCHED is disabled. Fix it by factoring out
    initialization of base part of cfqg into cfq_init_cfqg_base() and
    alloc/init/free explicitly if !CONFIG_CFQ_GROUP_IOSCHED.

    Signed-off-by: Tejun Heo
    Cc: Vivek Goyal
    Signed-off-by: Jens Axboe

    Tejun Heo
     
  • blkio_group is an association between a block cgroup and a queue for a
    given policy. Using opaque void * for association makes things
    confusing and hinders factoring of common code. Use request_queue *
    and, if necessary, policy id instead.

    This will help block cgroup API cleanup.

    Signed-off-by: Tejun Heo
    Cc: Vivek Goyal
    Signed-off-by: Jens Axboe

    Tejun Heo