22 Jan, 2014

1 commit

  • Jakub Zawadzki noticed that some divisions by reciprocal_divide()
    were not correct [1][2], which he could also show with BPF code
    after divisions are transformed into reciprocal_value() for runtime
    invariance which can be passed to reciprocal_divide() later on;
    reverse in BPF dump ended up with a different, off-by-one K in
    some situations.

    This has been fixed by Eric Dumazet in commit aee636c4809fa5
    ("bpf: do not use reciprocal divide"). This follow-up patch
    improves reciprocal_value() and reciprocal_divide() to work in
    all cases by using Granlund and Montgomery method, so that also
    future use is safe and without any non-obvious side-effects.
    Known problems with the old implementation were that division by 1
    always returned 0 and some off-by-ones when the dividend and divisor
    where very large. This seemed to not be problematic with its
    current users, as far as we can tell. Eric Dumazet checked for
    the slab usage, we cannot surely say so in the case of flex_array.
    Still, in order to fix that, we propose an extension from the
    original implementation from commit 6a2d7a955d8d resp. [3][4],
    by using the algorithm proposed in "Division by Invariant Integers
    Using Multiplication" [5], Torbjörn Granlund and Peter L.
    Montgomery, that is, pseudocode for q = n/d where q, n, d is in
    u32 universe:

    1) Initialization:

    int l = ceil(log_2 d)
    uword m' = floor((1<<
    Cc: Eric Dumazet
    Cc: Austin S Hemmelgarn
    Cc: linux-kernel@vger.kernel.org
    Cc: Jesse Gross
    Cc: Jamal Hadi Salim
    Cc: Stephen Hemminger
    Cc: Matt Mackall
    Cc: Pekka Enberg
    Cc: Christoph Lameter
    Cc: Andy Gospodarek
    Cc: Veaceslav Falico
    Cc: Jay Vosburgh
    Cc: Jakub Zawadzki
    Signed-off-by: Daniel Borkmann
    Signed-off-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Hannes Frederic Sowa
     

27 May, 2011

1 commit

  • On most architectures division is an expensive operation and accessing an
    element currently requires four of them. This performance penalty
    effectively precludes flex arrays from being used on any kind of fast
    path. However, two of these divisions can be handled at creation time and
    the others can be replaced by a reciprocal divide, completely avoiding
    real divisions on access.

    [eparis@redhat.com: rebase on top of changes to support 0 len elements]
    [eparis@redhat.com: initialize part_nr when array fits entirely in base]
    Signed-off-by: Jesse Gross
    Signed-off-by: Eric Paris
    Cc: Dave Hansen
    Cc: David Rientjes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jesse Gross
     

29 Apr, 2011

1 commit

  • Change flex_array_prealloc to take the number of elements for which space
    should be allocated instead of the last (inclusive) element. Users
    and documentation are updated accordingly. flex_arrays got introduced before
    they had users. When folks started using it, they ended up needing a
    different API than was coded up originally. This swaps over to the API that
    folks apparently need.

    Based-on-patch-by: Steffen Klassert
    Signed-off-by: Eric Paris
    Tested-by: Chris Richards
    Acked-by: Dave Hansen
    Cc: stable@kernel.org [2.6.38+]

    Eric Paris
     

01 Dec, 2010

1 commit


10 Aug, 2010

1 commit

  • Getting and putting arrays of pointers with flex arrays is a PITA. You
    have to remember to pass &ptr to the _put and you have to do weird and
    wacky casting to get the ptr back from the _get. Add two functions
    flex_array_get_ptr() and flex_array_put_ptr() to handle all of the magic.

    [akpm@linux-foundation.org: simplification suggested by Joe]
    Signed-off-by: Eric Paris
    Cc: David Rientjes
    Cc: Dave Hansen
    Cc: Joe Perches
    Cc: James Morris
    Cc: Joe Perches
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric Paris
     

22 Sep, 2009

3 commits

  • FLEX_ARRAY_INIT(element_size, total_nr_elements) cannot determine if
    either parameter is valid, so flex arrays which are statically allocated
    with this interface can easily become corrupted or reference beyond its
    allocated memory.

    This removes FLEX_ARRAY_INIT() as a struct flex_array initializer since no
    initializer may perform the required checking. Instead, the array is now
    defined with a new interface:

    DEFINE_FLEX_ARRAY(name, element_size, total_nr_elements)

    This may be prefixed with `static' for file scope.

    This interface includes compile-time checking of the parameters to ensure
    they are valid. Since the validity of both element_size and
    total_nr_elements depend on FLEX_ARRAY_BASE_SIZE and FLEX_ARRAY_PART_SIZE,
    the kernel build will fail if either of these predefined values changes
    such that the array parameters are no longer valid.

    Since BUILD_BUG_ON() requires compile time constants, several of the
    static inline functions that were once local to lib/flex_array.c had to be
    moved to include/linux/flex_array.h.

    Signed-off-by: David Rientjes
    Acked-by: Dave Hansen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Rientjes
     
  • Add a new function to the flex_array API:

    int flex_array_shrink(struct flex_array *fa)

    This function will free all unused second-level pages. Since elements are
    now poisoned if they are not allocated with __GFP_ZERO, it's possible to
    identify parts that consist solely of unused elements.

    flex_array_shrink() returns the number of pages freed.

    Signed-off-by: David Rientjes
    Cc: Dave Hansen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Rientjes
     
  • Add a new function to the flex_array API:

    int flex_array_clear(struct flex_array *fa,
    unsigned int element_nr)

    This function will zero the element at element_nr in the flex_array.

    Although this is equivalent to using flex_array_put() and passing a
    pointer to zero'd memory, flex_array_clear() does not require such a
    pointer to memory that would most likely need to be allocated on the
    caller's stack which could be significantly large depending on
    element_size.

    Signed-off-by: David Rientjes
    Cc: Dave Hansen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Rientjes
     

27 Aug, 2009

2 commits

  • It's problematic to allow signed element_nr's or total's to be passed as
    part of the flex array API.

    flex_array_alloc() allows total_nr_elements to be set to a negative
    quantity, which is obviously erroneous.

    flex_array_get() and flex_array_put() allows negative array indices in
    dereferencing an array part, which could address memory mapped before
    struct flex_array.

    The fix is to convert all existing element_nr formals to be qualified as
    unsigned. Existing checks to compare it to total_nr_elements or the max
    array size based on element_size need not be changed.

    Signed-off-by: David Rientjes
    Cc: Dave Hansen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Rientjes
     
  • The `parts' member of struct flex_array should evaluate to an incomplete
    type so that sizeof() cannot be used and C99 does not require the
    zero-length specification.

    Signed-off-by: David Rientjes
    Acked-by: Dave Hansen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Rientjes
     

30 Jul, 2009

1 commit

  • Once a structure goes over PAGE_SIZE*2, we see occasional allocation
    failures. Some people have chosen to switch over to things like vmalloc()
    that will let them keep array-like access to such a large structures.
    But, vmalloc() has plenty of downsides.

    Here's an alternative. I think it's what Andrew was suggesting here:

    http://lkml.org/lkml/2009/7/2/518

    I call it a flexible array. It does all of its work in PAGE_SIZE bits, so
    never does an order>0 allocation. The base level has
    PAGE_SIZE-2*sizeof(int) bytes of storage for pointers to the second level.
    So, with a 32-bit arch, you get about 4MB (4183112 bytes) of total
    storage when the objects pack nicely into a page. It is half that on
    64-bit because the pointers are twice the size. There's a table detailing
    this in the code.

    There are kerneldocs for the functions, but here's an
    overview:

    flex_array_alloc() - dynamically allocate a base structure
    flex_array_free() - free the array and all of the
    second-level pages
    flex_array_free_parts() - free the second-level pages, but
    not the base (for static bases)
    flex_array_put() - copy into the array at the given index
    flex_array_get() - copy out of the array at the given index
    flex_array_prealloc() - preallocate the second-level pages
    between the given indexes to
    guarantee no allocs will occur at
    put() time.

    We could also potentially just pass the "element_size" into each of the
    API functions instead of storing it internally. That would get us one
    more base pointer on 32-bit.

    I've been testing this by running it in userspace. The header and patch
    that I've been using are here, as well as the little script I'm using to
    generate the size table which goes in the kerneldocs.

    http://sr71.net/~dave/linux/flexarray/

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Dave Hansen
    Reviewed-by: KAMEZAWA Hiroyuki
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave Hansen