27 Oct, 2010

6 commits

  • Improve 'lib_sort()' test and check that:
    o 'cmp()' is called only for elements which were present in the original list,
    i.e., the 'a' and 'b' parameters are valid
    o the resulted (sorted) list consists onlly of the original elements
    o intdoruce "poison" fields to make sure data around 'struc list_head' field
    are not corrupted.

    Signed-off-by: Artem Bityutskiy
    Cc: Don Mullis
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Artem Bityutskiy
     
  • This patch unifies 'list_sort_test()' messages a bit and makes sure all of
    them start with the "list_sort_test:" prefix to make it obvious for users
    where the messages come from.

    Signed-off-by: Artem Bityutskiy
    Cc: Don Mullis
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Artem Bityutskiy
     
  • The 'lib_sort()' test does not free memory if it fails, and it makes the
    kernel panic if it cannot allocate memory. This patch fixes the problem.

    This patch also changes several small things:
    o use 'list_add()' helper instead of adding manually
    o introduce temporary 'el1' variable to avoid ugly and unreadalbe
    "if" statement
    o make 'head' to be stack variable instead of 'kmalloc()'ed, which
    simplifies code a bit

    Overall, this patch is of clean-up type.

    Signed-off-by: Artem Bityutskiy
    Cc: Don Mullis
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Artem Bityutskiy
     
  • Instead of using own pseudo-random generator, use generic linux
    'random32()' function. Presumably, this should improve test coverage.

    At the same time, do the following changes:
    o Use shorter macro name for test list length
    o Do not use strange 'l_h' name for 'struct list_head' element,
    use 'list', because it is traditional name and thus, makes the
    code more obvious and readable.

    Signed-off-by: Artem Bityutskiy
    Cc: Don Mullis
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Artem Bityutskiy
     
  • I do not see any reason to use KERN_WARN for normal messages and
    KERN_EMERG for error messages in the lib_sort testing routine. Let's use
    more reasonable KERN_NORM and KERN_ERR levels.

    Signed-off-by: Artem Bityutskiy
    Cc: Don Mullis
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Artem Bityutskiy
     
  • While hunting a non-existing bug in 'list_sort()', I've improved the
    'list_sort_test()' function which tests the 'list_sort()' library call.
    Although at the end I found a bug in my code, but not in 'list_sort()', I
    think my clean-ups and improvements are worth merging because they make
    the test function better.

    This patch:

    Make the self-tests selectable via Kconfig rather than by manual enabling
    of DEBUG_LIST_SORT.

    Signed-off-by: Artem Bityutskiy
    Cc: Don Mullis
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Artem Bityutskiy
     

02 Oct, 2010

1 commit

  • If the original list is a POT in length, the first callback from line 73
    will pass a==b both pointing to the original list_head. This is dangerous
    because the 'list_sort()' user can use 'container_of()' and accesses the
    "containing" object, which does not necessary exist for the list head. So
    the user can access RAM which does not belong to him. If this is a write
    access, we can end up with memory corruption.

    Signed-off-by: Don Mullis
    Tested-by: Artem Bityutskiy
    Signed-off-by: Artem Bityutskiy
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Don Mullis
     

07 Mar, 2010

2 commits

  • Clarify and correct header comment of list_sort().

    Signed-off-by: Don Mullis
    Cc: Dave Airlie
    Cc: Andi Kleen
    Cc: Dave Chinner
    Cc: Artem Bityutskiy
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Don Mullis
     
  • XFS and UBIFS can pass long lists to list_sort(); this alternative
    implementation scales better, reaching ~3x performance gain when list
    length exceeds the L2 cache size.

    Stand-alone program timings were run on a Core 2 duo L1=32KB L2=4MB,
    gcc-4.4, with flags extracted from an Ubuntu kernel build. Object size is
    581 bytes compared to 455 for Mark J. Roberts' code.

    Worst case for either implementation is a list length just over a power of
    two, and to roughly the same degree, so here are timing results for a
    range of 2^N+1 lengths. List elements were 16 bytes each including malloc
    overhead; initial order was random.

    time (msec)
    Tatham-Roberts
    | generic-Mullis-v2
    loop_count length | | ratio
    4000000 2 206 294 1.427
    2000000 3 176 227 1.289
    1000000 5 199 172 0.864
    500000 9 235 178 0.757
    250000 17 243 182 0.748
    125000 33 261 196 0.750
    62500 65 277 209 0.754
    31250 129 292 219 0.75
    15625 257 317 235 0.741
    7812 513 340 252 0.741
    3906 1025 362 267 0.737
    1953 2049 388 283 0.729 ~ L1 size
    976 4097 556 323 0.580
    488 8193 678 361 0.532
    244 16385 773 395 0.510
    122 32769 844 418 0.495
    61 65537 917 454 0.495
    30 131073 1128 543 0.481
    15 262145 2355 869 0.369 ~ L2 size
    7 524289 5597 1714 0.306
    3 1048577 6218 2022 0.325

    Mark's code does not actually implement the usual or generic mergesort,
    but rather a variant from Simon Tatham described here:

    http://www.chiark.greenend.org.uk/~sgtatham/algorithms/listsort.html

    Simon's algorithm performs O(log N) passes over the entire input list,
    doing merges of sublists that double in size on each pass. The generic
    algorithm instead merges pairs of equal length lists as early as possible,
    in recursive order. For either algorithm, the elements that extend the
    list beyond power-of-two length are a special case, handled as nearly as
    possible as a "rounding-up" to a full POT.

    Some intuition for the locality of reference implications of merge order
    may be gotten by watching this animation:

    http://www.sorting-algorithms.com/merge-sort

    Simon's algorithm requires only O(1) extra space rather than the generic
    algorithm's O(log N), but in my non-recursive implementation the actual
    O(log N) data is merely a vector of ~20 pointers, which I've put on the
    stack.

    Long-running list_sort() calls: If the list passed in may be long, or the
    client's cmp() callback function is slow, the client's cmp() may
    periodically invoke cond_resched() to voluntarily yield the CPU. All
    inner loops of list_sort() call back to cmp().

    Stability of the sort: distinct elements that compare equal emerge from
    the sort in the same order as with Mark's code, for simple test cases. A
    boot-time test is provided to verify this and other correctness
    requirements.

    A kernel that uses drm.ko appears to run normally with this change; I have
    no suitable hardware to similarly test the use by UBIFS.

    [akpm@linux-foundation.org: style tweaks, fix comment, make list_sort_test __init]
    Signed-off-by: Don Mullis
    Cc: Dave Airlie
    Cc: Andi Kleen
    Cc: Dave Chinner
    Cc: Artem Bityutskiy
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Don Mullis
     

13 Jan, 2010

1 commit

  • There are two copies of list_sort() in the tree already, one in the DRM
    code, another in ubifs. Now XFS needs this as well. Create a generic
    list_sort() function from the ubifs version and convert existing users
    to it so we don't end up with yet another copy in the tree.

    Signed-off-by: Dave Chinner
    Acked-by: Dave Airlie
    Acked-by: Artem Bityutskiy
    Signed-off-by: Linus Torvalds

    Dave Chinner