30 Apr, 2013

1 commit

  • As Tejun points out, there are several users of the IDR facility that
    attempt to use it in a cyclic fashion. These users are likely to see
    -ENOSPC errors after the counter wraps one or more times however.

    This patchset adds a new idr_alloc_cyclic routine and converts several
    of these users to it. Many of these users are in obscure parts of the
    kernel, and I don't have a good way to test some of them. The change is
    pretty straightforward though, so hopefully it won't be an issue.

    There is one other cyclic user of idr_alloc that I didn't touch in
    ipc/util.c. That one is doing some strange stuff that I didn't quite
    understand, but it looks like it should probably be converted later
    somehow.

    This patch:

    Thus spake Tejun Heo:

    Ooh, BTW, the cyclic allocation is broken. It's prone to -ENOSPC
    after the first wraparound. There are several cyclic users in the
    kernel and I think it probably would be best to implement cyclic
    support in idr.

    This patch does that by adding new idr_alloc_cyclic function that such
    users in the kernel can use. With this, there's no need for a caller to
    keep track of the last value used as that's now tracked internally. This
    should prevent the ENOSPC problems that can hit when the "last allocated"
    counter exceeds INT_MAX.

    Later patches will convert existing cyclic users to the new interface.

    Signed-off-by: Jeff Layton
    Reviewed-by: Tejun Heo
    Cc: "David S. Miller"
    Cc: "J. Bruce Fields"
    Cc: Eric Paris
    Cc: Jack Morgenstein
    Cc: John McCutchan
    Cc: Neil Horman
    Cc: Or Gerlitz
    Cc: Robert Love
    Cc: Roland Dreier
    Cc: Sridhar Samudrala
    Cc: Steve Wise
    Cc: Tom Tucker
    Cc: Vlad Yasevich

    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Layton
     

14 Mar, 2013

2 commits

  • GFP_NOIO is often used for idr_alloc() inside preloaded section as the
    allocation mask doesn't really matter. If the idr tree needs to be
    expanded, idr_alloc() first tries to allocate using the specified
    allocation mask and if it fails falls back to the preloaded buffer. This
    order prevent non-preloading idr_alloc() users from taking advantage of
    preloading ones by using preload buffer without filling it shifting the
    burden of allocation to the preload users.

    Unfortunately, this allowed/expected-to-fail kmem_cache allocation ends up
    generating spurious slab lowmem warning before succeeding the request from
    the preload buffer.

    This patch makes idr_layer_alloc() add __GFP_NOWARN to the first
    kmem_cache attempt and try kmem_cache again w/o __GFP_NOWARN after
    allocation from preload_buffer fails so that lowmem warning is generated
    if not suppressed by the original @gfp_mask.

    Signed-off-by: Tejun Heo
    Reported-by: David Teigland
    Tested-by: David Teigland
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tejun Heo
     
  • Now that all in-kernel users are converted to ues the new alloc
    interface, mark the old interface deprecated. We should be able to
    remove these in a few releases.

    Signed-off-by: Tejun Heo
    Cc: Rusty Russell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tejun Heo
     

13 Mar, 2013

1 commit

  • Fix new kernel-doc warnings in idr:

    Warning(include/linux/idr.h:113): No description found for parameter 'idr'
    Warning(include/linux/idr.h:113): Excess function parameter 'idp' description in 'idr_find'
    Warning(lib/idr.c:232): Excess function parameter 'id' description in 'sub_alloc'
    Warning(lib/idr.c:232): Excess function parameter 'id' description in 'sub_alloc'

    Signed-off-by: Randy Dunlap
    Acked-by: Tejun Heo
    Signed-off-by: Linus Torvalds

    Randy Dunlap
     

09 Mar, 2013

1 commit

  • idr_find(), idr_remove() and idr_replace() used to silently ignore the
    sign bit and perform lookup with the rest of the bits. The weird behavior
    has been changed such that negative IDs are treated as invalid. As the
    behavior change was subtle, WARN_ON_ONCE() was added in the hope of
    determining who's calling idr functions with negative IDs so that they can
    be examined for problems.

    Up until now, all two reported cases are ID number coming directly from
    userland and getting fed into idr_find() and the warnings seem to cause
    more problems than being helpful. Drop the WARN_ON_ONCE()s.

    Signed-off-by: Tejun Heo
    Reported-by:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tejun Heo
     

28 Feb, 2013

13 commits

  • Until recently, when an negative ID is specified, idr functions used to
    ignore the sign bit and proceeded with the operation with the rest of
    bits, which is bizarre and error-prone. The behavior recently got changed
    so that negative IDs are treated as invalid but we're triggering
    WARN_ON_ONCE() on negative IDs just in case somebody was depending on the
    sign bit being ignored, so that those can be detected and fixed easily.

    We only need this for a while. Explain why WARN_ON_ONCE()s are there and
    that they can be removed later.

    Signed-off-by: Tejun Heo
    Acked-by: Thomas Gleixner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tejun Heo
     
  • While idr lookup isn't a particularly heavy operation, it still is too
    substantial to use in hot paths without worrying about the performance
    implications. With recent changes, each idr_layer covers 256 slots
    which should be enough to cover most use cases with single idr_layer
    making lookup hint very attractive.

    This patch adds idr->hint which points to the idr_layer which
    allocated an ID most recently and the fast path lookup becomes

    if (look up target's prefix matches that of the hinted layer)
    return hint->ary[ID's offset in the leaf layer];

    which can be inlined.

    idr->hint is set to the leaf node on idr_fill_slot() and cleared from
    free_layer().

    [andriy.shevchenko@linux.intel.com: always do slow path when hint is uninitialized]
    Signed-off-by: Tejun Heo
    Cc: Kirill A. Shutemov
    Cc: Sasha Levin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tejun Heo
     
  • Add a field which carries the prefix of ID the idr_layer covers. This
    will be used to implement lookup hint.

    This patch doesn't make use of the new field and doesn't introduce any
    behavior difference.

    Signed-off-by: Tejun Heo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tejun Heo
     
  • Currently, idr->bitmap is declared as an unsigned long which restricts
    the number of bits an idr_layer can contain. All bitops can handle
    arbitrary positive integer bit number and there's no reason for this
    restriction.

    Declare idr_layer->bitmap using DECLARE_BITMAP() instead of a single
    unsigned long.

    * idr_layer->bitmap is now an array. '&' dropped from params to
    bitops.

    * Replaced "== IDR_FULL" tests with bitmap_full() and removed
    IDR_FULL.

    * Replaced find_next_bit() on ~bitmap with find_next_zero_bit().

    * Replaced "bitmap = 0" with bitmap_clear().

    This patch doesn't (or at least shouldn't) introduce any behavior
    changes.

    [akpm@linux-foundation.org: checkpatch fixes]
    Signed-off-by: Tejun Heo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tejun Heo
     
  • MAX_IDR_MASK is another weirdness in the idr interface. As idr covers
    whole positive integer range, it's defined as 0x7fffffff or INT_MAX.

    Its usage in idr_find(), idr_replace() and idr_remove() is bizarre.
    They basically mask off the sign bit and operate on the rest, so if
    the caller, by accident, passes in a negative number, the sign bit
    will be masked off and the remaining part will be used as if that was
    the input, which is worse than crashing.

    The constant is visible in idr.h and there are several users in the
    kernel.

    * drivers/i2c/i2c-core.c:i2c_add_numbered_adapter()

    Basically used to test if adap->nr is a negative number which isn't
    -1 and returns -EINVAL if so. idr_alloc() already has negative
    @start checking (w/ WARN_ON_ONCE), so this can go away.

    * drivers/infiniband/core/cm.c:cm_alloc_id()
    drivers/infiniband/hw/mlx4/cm.c:id_map_alloc()

    Used to wrap cyclic @start. Can be replaced with max(next, 0).
    Note that this type of cyclic allocation using idr is buggy. These
    are prone to spurious -ENOSPC failure after the first wraparound.

    * fs/super.c:get_anon_bdev()

    The ID allocated from ida is masked off before being tested whether
    it's inside valid range. ida allocated ID can never be a negative
    number and the masking is unnecessary.

    Update idr_*() functions to fail with -EINVAL when negative @id is
    specified and update other MAX_IDR_MASK users as described above.

    This leaves MAX_IDR_MASK without any user, remove it and relocate
    other MAX_IDR_* constants to lib/idr.c.

    Signed-off-by: Tejun Heo
    Cc: Jean Delvare
    Cc: Roland Dreier
    Cc: Sean Hefty
    Cc: Hal Rosenstock
    Cc: "Marciniszyn, Mike"
    Cc: Jack Morgenstein
    Cc: Or Gerlitz
    Cc: Al Viro
    Acked-by: Wolfram Sang
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tejun Heo
     
  • Most functions in idr fail to deal with the high bits when the idr
    tree grows to the maximum height.

    * idr_get_empty_slot() stops growing idr tree once the depth reaches
    MAX_IDR_LEVEL - 1, which is one depth shallower than necessary to
    cover the whole range. The function doesn't even notice that it
    didn't grow the tree enough and ends up allocating the wrong ID
    given sufficiently high @starting_id.

    For example, on 64 bit, if the starting id is 0x7fffff01,
    idr_get_empty_slot() will grow the tree 5 layer deep, which only
    covers the 30 bits and then proceed to allocate as if the bit 30
    wasn't specified. It ends up allocating 0x3fffff01 without the bit
    30 but still returns 0x7fffff01.

    * __idr_remove_all() will not remove anything if the tree is fully
    grown.

    * idr_find() can't find anything if the tree is fully grown.

    * idr_for_each() and idr_get_next() can't iterate anything if the tree
    is fully grown.

    Fix it by introducing idr_max() which returns the maximum possible ID
    given the depth of tree and replacing the id limit checks in all
    affected places.

    As the idr_layer pointer array pa[] needs to be 1 larger than the
    maximum depth, enlarge pa[] arrays by one.

    While this plugs the discovered issues, the whole code base is
    horrible and in desparate need of rewrite. It's fragile like hell,

    Signed-off-by: Tejun Heo
    Cc: Rusty Russell
    Cc:

    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tejun Heo
     
  • The current idr interface is very cumbersome.

    * For all allocations, two function calls - idr_pre_get() and
    idr_get_new*() - should be made.

    * idr_pre_get() doesn't guarantee that the following idr_get_new*()
    will not fail from memory shortage. If idr_get_new*() returns
    -EAGAIN, the caller is expected to retry pre_get and allocation.

    * idr_get_new*() can't enforce upper limit. Upper limit can only be
    enforced by allocating and then freeing if above limit.

    * idr_layer buffer is unnecessarily per-idr. Each idr ends up keeping
    around MAX_IDR_FREE idr_layers. The memory consumed per idr is
    under two pages but it makes it difficult to make idr_layer larger.

    This patch implements the following new set of allocation functions.

    * idr_preload[_end]() - Similar to radix preload but doesn't fail.
    The first idr_alloc() inside preload section can be treated as if it
    were called with @gfp_mask used for idr_preload().

    * idr_alloc() - Allocate an ID w/ lower and upper limits. Takes
    @gfp_flags and can be used w/o preloading. When used inside
    preloaded section, the allocation mask of preloading can be assumed.

    If idr_alloc() can be called from a context which allows sufficiently
    relaxed @gfp_mask, it can be used by itself. If, for example,
    idr_alloc() is called inside spinlock protected region, preloading can
    be used like the following.

    idr_preload(GFP_KERNEL);
    spin_lock(lock);

    id = idr_alloc(idr, ptr, start, end, GFP_NOWAIT);

    spin_unlock(lock);
    idr_preload_end();
    if (id < 0)
    error;

    which is much simpler and less error-prone than idr_pre_get and
    idr_get_new*() loop.

    The new interface uses per-pcu idr_layer buffer and thus the number of
    idr's in the system doesn't affect the amount of memory used for
    preloading.

    idr_layer_alloc() is introduced to handle idr_layer allocations for
    both old and new ID allocation paths. This is a bit hairy now but the
    new interface is expected to replace the old and the internal
    implementation eventually will become simpler.

    Signed-off-by: Tejun Heo
    Cc: Rusty Russell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tejun Heo
     
  • Move slot filling to idr_fill_slot() from idr_get_new_above_int() and
    make idr_get_new_above() directly call it. idr_get_new_above_int() is
    no longer needed and removed.

    This will be used to implement a new ID allocation interface.

    Signed-off-by: Tejun Heo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tejun Heo
     
  • idr uses -1, IDR_NEED_TO_GROW and IDR_NOMORE_SPACE to communicate
    exception conditions internally. The return value is later translated
    to errno values using _idr_rc_to_errno().

    This is confusing. Drop the custom ones and consistently use -EAGAIN
    for "tree needs to grow", -ENOMEM for "need more memory" and -ENOSPC for
    "ran out of ID space".

    Due to the weird memory preloading mechanism, [ra]_get_new*() return
    -EAGAIN on memory shortage, so we need to substitute -ENOMEM w/
    -EAGAIN on those interface functions. They'll eventually be cleaned
    up and the translations will go away.

    This patch doesn't introduce any functional changes.

    Signed-off-by: Tejun Heo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tejun Heo
     
  • * Move idr_for_each_entry() definition next to other idr related
    definitions.

    * Make id[r|a]_get_new() inline wrappers of id[r|a]_get_new_above().

    This changes the implementation of idr_get_new() but the new
    implementation is trivial. This patch doesn't introduce any
    functional change.

    Signed-off-by: Tejun Heo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tejun Heo
     
  • There was only one legitimate use of idr_remove_all() and a lot more of
    incorrect uses (or lack of it). Now that idr_destroy() implies
    idr_remove_all() and all the in-kernel users updated not to use it,
    there's no reason to keep it around. Mark it deprecated so that we can
    later unexport it.

    idr_remove_all() is made an inline function calling __idr_remove_all()
    to avoid triggering deprecated warning on EXPORT_SYMBOL().

    Signed-off-by: Tejun Heo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tejun Heo
     
  • idr is silly in quite a few ways, one of which is how it's supposed to
    be destroyed - idr_destroy() doesn't release IDs and doesn't even whine
    if the idr isn't empty. If the caller forgets idr_remove_all(), it
    simply leaks memory.

    Even ida gets this wrong and leaks memory on destruction. There is
    absoltely no reason not to call idr_remove_all() from idr_destroy().
    Nobody is abusing idr_destroy() for shrinking free layer buffer and
    continues to use idr after idr_destroy(), so it's safe to do remove_all
    from destroy.

    In the whole kernel, there is only one place where idr_remove_all() is
    legitimiately used without following idr_destroy() while there are quite
    a few places where the caller forgets either idr_remove_all() or
    idr_destroy() leaking memory.

    This patch makes idr_destroy() call idr_destroy_all() and updates the
    function description accordingly.

    Signed-off-by: Tejun Heo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tejun Heo
     
  • The iteration logic of idr_get_next() is borrowed mostly verbatim from
    idr_for_each(). It walks down the tree looking for the slot matching
    the current ID. If the matching slot is not found, the ID is
    incremented by the distance of single slot at the given level and
    repeats.

    The implementation assumes that during the whole iteration id is aligned
    to the layer boundaries of the level closest to the leaf, which is true
    for all iterations starting from zero or an existing element and thus is
    fine for idr_for_each().

    However, idr_get_next() may be given any point and if the starting id
    hits in the middle of a non-existent layer, increment to the next layer
    will end up skipping the same offset into it. For example, an IDR with
    IDs filled between [64, 127] would look like the following.

    [ 0 64 ... ]
    /----/ |
    | |
    NULL [ 64 ... 127 ]

    If idr_get_next() is called with 63 as the starting point, it will try
    to follow down the pointer from 0. As it is NULL, it will then try to
    proceed to the next slot in the same level by adding the slot distance
    at that level which is 64 - making the next try 127. It goes around the
    loop and finds and returns 127 skipping [64, 126].

    Note that this bug also triggers in idr_for_each_entry() loop which
    deletes during iteration as deletions can make layers go away leaving
    the iteration with unaligned ID into missing layers.

    Fix it by ensuring proceeding to the next slot doesn't carry over the
    unaligned offset - ie. use round_up(id + 1, slot_distance) instead of
    id += slot_distance.

    Signed-off-by: Tejun Heo
    Reported-by: David Teigland
    Cc: KAMEZAWA Hiroyuki
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tejun Heo
     

06 Oct, 2012

1 commit

  • To avoid name conflicts:

    drivers/video/riva/fbdev.c:281:9: sparse: preprocessor token MAX_LEVEL redefined

    While at it, also make the other names more consistent and add
    parentheses.

    [akpm@linux-foundation.org: repair fallout]
    [sfr@canb.auug.org.au: IB/mlx4: fix for MAX_ID_MASK to MAX_IDR_MASK name change]
    Signed-off-by: Fengguang Wu
    Cc: Bernd Petrovitsch
    Cc: walter harms
    Cc: Glauber Costa
    Signed-off-by: Stephen Rothwell
    Cc: Roland Dreier
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Fengguang Wu
     

25 Mar, 2012

1 commit

  • Pull cleanup of fs/ and lib/ users of module.h from Paul Gortmaker:
    "Fix up files in fs/ and lib/ dirs to only use module.h if they really
    need it.

    These are trivial in scope vs the work done previously. We now have
    things where any few remaining cleanups can be farmed out to arch or
    subsystem maintainers, and I have done so when possible. What is
    remaining here represents the bits that don't clearly lie within a
    single arch/subsystem boundary, like the fs dir and the lib dir.

    Some duplicate includes arising from overlapping fixes from
    independent subsystem maintainer submissions are also quashed."

    Fix up trivial conflicts due to clashes with other include file cleanups
    (including some due to the previous bug.h cleanup pull).

    * tag 'module-for-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux:
    lib: reduce the use of module.h wherever possible
    fs: reduce the use of module.h wherever possible
    includecheck: delete any duplicate instances of module.h

    Linus Torvalds
     

22 Mar, 2012

1 commit

  • Make one small adjustment to idr_get_next(): take the height from the top
    layer (stable under RCU) instead of from the root (unprotected by RCU), as
    idr_find() does: so that it can be used with RCU locking. Copied comment
    on RCU locking from idr_find().

    Signed-off-by: Hugh Dickins
    Acked-by: KAMEZAWA Hiroyuki
    Acked-by: Li Zefan
    Cc: Eric Dumazet
    Acked-by: Tejun Heo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     

08 Mar, 2012

1 commit


03 Nov, 2011

1 commit

  • It's often convenient to be able to release resource from IRQ context.
    Make ida_simple_*() use irqsave/restore spin ops so that they are IRQ
    safe.

    Signed-off-by: Tejun Heo
    Acked-by: Rusty Russell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tejun Heo
     

01 Nov, 2011

1 commit


15 Sep, 2011

1 commit


04 Aug, 2011

2 commits


27 Oct, 2010

2 commits

  • Add idr/ida to kernel-api docbook.
    Fix typos and kernel-doc notation.

    Signed-off-by: Randy Dunlap
    Acked-by: Tejun Heo
    Cc: Naohiro Aota
    Cc: Jiri Kosina
    Signed-off-by: Linus Torvalds

    Randy Dunlap
     
  • Despite the idr_pre_get() kernel-doc, there are some cases where you can
    call idr_pre_get() from within locked regions. Add a description for such
    cases.

    See also: http://lkml.org/lkml/2010/9/16/462

    [akpm@linux-foundation.org: cleanups, grammatical fixes]
    Signed-off-by: Naohiro Aota
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Naohiro Aota
     

31 Aug, 2010

2 commits

  • It was unclear in original kernel-doc how nextidp worked in
    idr_get_next(). Let's describe it.

    Signed-off-by: Naohiro Aota
    Acked-by: Tejun Heo
    Signed-off-by: Jiri Kosina

    Naohiro Aota
     
  • Fix the following kernel-doc warnings.

    % perl scripts/kernel-doc lib/idr.c > /dev/null
    Warning(lib/idr.c:300): No description found for parameter 'starting_id'
    Warning(lib/idr.c:300): Excess function parameter 'start_id' description in 'idr_get_new_above'
    Warning(lib/idr.c:485): No description found for parameter 'idp'
    Warning(lib/idr.c:596): No description found for parameter 'nextidp'
    Warning(lib/idr.c:596): Excess function parameter 'id' description in 'idr_get_next'
    Warning(lib/idr.c:774): No description found for parameter 'starting_id'
    Warning(lib/idr.c:774): Excess function parameter 'staring_id' description in 'ida_get_new_above'
    Warning(lib/idr.c:918): No description found for parameter 'ida'

    Signed-off-by: Naohiro Aota
    Acked-by: Tejun Heo
    Signed-off-by: Jiri Kosina

    Naohiro Aota
     

23 Jun, 2010

1 commit


28 May, 2010

1 commit

  • Currently idr_remove_all will fail with a use after free error if
    idr::layers is bigger than 2, which on 32 bit systems corresponds to items
    more than 1024. This is due to stepping back too many levels during
    backtracking. For simplicity let's assume that IDR_BITS=1 -> we have 2
    nodes at each level below the root node and each leaf node stores two IDs.
    (In reality for 32 bit systems IDR_BITS=5, with 32 nodes at each sub-root
    level and 32 IDs in each leaf node). The sequence of freeing the nodes at
    the moment is as follows:

    layer
    1 -> a(7)
    2 -> b(3) c(5)
    3 -> d(1) e(2) f(4) g(6)

    Until step 4 things go fine, but then node c is freed, whereas node g
    should be freed first. Since node c contains the pointer to node g we'll
    have a use after free error at step 6.

    How many levels we step back after visiting the leaf nodes is currently
    determined by the msb of the id we are currently visiting:

    Step
    1. node d with IDs 0,1 is freed, current ID is advanced to 2.
    msb of the current ID bit 1. This means we need to step back
    1 level to node b and take the next sibling, node e.
    2-3. node e with IDs 2,3 is freed, current ID is 4, msb is bit 2.
    This means we need to step back 2 levels to node a, freeing
    node b on the way.
    4-5. node f with IDs 4,5 is freed, current ID is 6, msb is still
    bit 2. This means we again need to step back 2 levels to node
    a and free c on the way.
    6. We should visit node g, but its pointer is not available as
    node c was freed.

    The fix changes how we determine the number of levels to step back.
    Instead of deducting this merely from the msb of the current ID, we should
    really check if advancing the ID causes an overflow to a bit position
    corresponding to a given layer. In the above example overflow from bit 0
    to bit 1 should mean stepping back 1 level. Overflow from bit 1 to bit 2
    should mean stepping back 2 levels and so on.

    The fix was tested with IDs up to 1 << 20, which corresponds to 4 layers
    on 32 bit systems.

    Signed-off-by: Imre Deak
    Reviewed-by: Tejun Heo
    Cc: Eric Paris
    Cc: "Paul E. McKenney"
    Cc: [2.6.34.1]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Imre Deak
     

26 Mar, 2010

1 commit


27 Feb, 2010

1 commit


25 Feb, 2010

2 commits

  • idr_get_next() was accidentally not exported when added. It is about
    to be used by mtdcore, which may be built as a module.

    Signed-off-by: Ben Hutchings
    Acked-by: KAMEZAWA Hiroyuki
    Signed-off-by: Artem Bityutskiy
    Signed-off-by: David Woodhouse

    Ben Hutchings
     
  • Because idr can be used with any of a number of locks or with
    any flavor of RCU, just disable the lockdep-based diagnostics.
    If idr needs diagnostics, the check expression will need to be
    passed into the relevant idr primitives as an additional
    argument.

    Signed-off-by: Paul E. McKenney
    Cc: laijs@cn.fujitsu.com
    Cc: dipankar@in.ibm.com
    Cc: mathieu.desnoyers@polymtl.ca
    Cc: josh@joshtriplett.org
    Cc: dvhltc@us.ibm.com
    Cc: niv@us.ibm.com
    Cc: peterz@infradead.org
    Cc: rostedt@goodmis.org
    Cc: Valdis.Kletnieks@vt.edu
    Cc: dhowells@redhat.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
     

23 Feb, 2010

1 commit

  • This is retry of reverted 859ddf09743a8cc680af33f7259ccd0fd36bfe9d
    ("idr: fix a critical misallocation bug") which contained two bugs.

    * pa[idp->layers] should be cleared even if it's not used by
    sub_alloc() because it's used by mark idr_mark_full().

    * The original condition check also assigned pa[l] to p which the new
    code didn't do thus leaving p pointing at the wrong layer.

    Both problems have been fixed and the idr code has received good amount
    testing using userland testing setup where simple bitmap allocator is
    run parallel to verify the result of idr allocation.

    The bug this patch fixes is caused by sub_alloc() optimization path
    bypassing out-of-room condition check and restarting allocation loop
    with starting value higher than maximum allowed value. For detailed
    description, please read commit message of 859ddf09.

    Signed-off-by: Tejun Heo
    Based-on-patch-from: Eric Paris
    Reported-by: Eric Paris
    Tested-by: Stefan Lippers-Hollmann
    Tested-by: Serge Hallyn
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tejun Heo
     

05 Feb, 2010

1 commit

  • Commit 859ddf09743a8cc680af33f7259ccd0fd36bfe9d tried to fix
    misallocation bug but broke full bit marking by not clearing
    pa[idp->layers] and also is causing X failures due to lookup failure
    in drm code. The cause of the latter hasn't been found yet. Revert
    the fix for now.

    Signed-off-by: Tejun Heo
    Signed-off-by: Linus Torvalds

    Tejun Heo
     

03 Feb, 2010

1 commit

  • Eric Paris located a bug in idr. With IDR_BITS of 6, it grows to three
    layers when id 4096 is first allocated. When that happens, idr wraps
    incorrectly and searches the idr array ignoring the high bits. The
    following test code from Eric demonstrates the bug nicely.

    #include
    #include
    #include

    static DEFINE_IDR(test_idr);

    int init_module(void)
    {
    int ret, forty95, forty96;
    void *addr;

    /* add 2 entries both with 4095 as the start address */
    again1:
    if (!idr_pre_get(&test_idr, GFP_KERNEL))
    return -ENOMEM;
    ret = idr_get_new_above(&test_idr, (void *)4095, 4095, &forty95);
    if (ret) {
    if (ret == -EAGAIN)
    goto again1;
    return ret;
    }
    if (forty95 != 4095)
    printk(KERN_ERR "hmmm, forty95=%d\n", forty95);

    again2:
    if (!idr_pre_get(&test_idr, GFP_KERNEL))
    return -ENOMEM;
    ret = idr_get_new_above(&test_idr, (void *)4096, 4095, &forty96);
    if (ret) {
    if (ret == -EAGAIN)
    goto again2;
    return ret;
    }
    if (forty96 != 4096)
    printk(KERN_ERR "hmmm, forty96=%d\n", forty96);

    /* try to find the 2 entries, noticing that 4096 broke */
    addr = idr_find(&test_idr, forty95);
    if ((int)addr != forty95)
    printk(KERN_ERR "hmmm, after find forty95=%d addr=%d\n", forty95, (int)addr);
    addr = idr_find(&test_idr, forty96);
    if ((int)addr != forty96)
    printk(KERN_ERR "hmmm, after find forty96=%d addr=%d\n", forty96, (int)addr);
    /* really weird, the entry which should be at 4096 is actually at 0!! */
    addr = idr_find(&test_idr, 0);
    if ((int)addr)
    printk(KERN_ERR "found an entry at id=0 for addr=%d\n", (int)addr);

    idr_remove(&test_idr, forty95);
    idr_remove(&test_idr, forty96);

    return 0;
    }

    void cleanup_module(void)
    {
    }

    MODULE_AUTHOR("Eric Paris ");
    MODULE_DESCRIPTION("Simple idr test");
    MODULE_LICENSE("GPL");

    This happens because when sub_alloc() back tracks it doesn't always do it
    step-by-step while the over-the-limit detection assumes step-by-step
    backtracking. The logic in sub_alloc() looks like the following.

    restart:
    clear pa[top level + 1] for end cond detection
    l = top level
    while (true) {
    search for empty slot at this level
    if (not found) {
    push id to the next possible value
    l++
    A: if (pa[l] is clear)
    failed, return asking caller to grow the tree
    if (going up 1 level gives more slots to search)
    continue the while loop above with the incremented l
    else
    C: goto restart
    }
    adjust id accordingly to the found slot
    if (l == 0)
    return found id;
    create lower level if not there yet
    record pa[l] and l--
    }

    Test A is the fail exit condition but this assumes that failure is
    propagated upwared one level at a time but the B optimization path breaks
    the assumption and restarts the whole thing with a start value which is
    above the possible limit with the current layers. sub_alloc() assumes the
    start id value is inside the limit when called and test A is the only exit
    condition check, so it ends up searching for empty slot while ignoring
    high set bit.

    So, for 4095->4096 test, level0 search fails but pa[1] contains a valid
    pointer. However, going up 1 level wouldn't give any more empty slot so
    it takes C and when the whole thing restarts nobody notices the high bit
    set beyond the top level.

    This patch fixes the bug by changing the fail exit condition check to full
    id limit check.

    Based-on-patch-from: Eric Paris
    Reported-by: Eric Paris
    Signed-off-by: Tejun Heo
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tejun Heo