06 Oct, 2012

1 commit

  • Permit the use of an algorithm other than the default first-fit one. For
    example, a custom algorithm could be used to manage alignment requirements.

    As I can't predict all the possible requirements/needs for all allocation
    use cases, I add a "free" field 'void *data' to pass any needed
    information to the allocation function. For example, 'data' could point
    to a structure that stores the alignment, the expected memory bank, the
    requester device, or any other information that could influence the
    allocation algorithm.

    A usage example may look like this:

        struct my_pool_constraints {
            int align;
            int bank;
            ...
        };

        unsigned long my_custom_algo(unsigned long *map, unsigned long size,
                        unsigned long start, unsigned int nr, void *data)
        {
            struct my_pool_constraints *constraints = data;
            ...
            /* deal with allocation constraints */
            ...
            /* return the bitmap index at which to perform the allocation */
        }

        void create_my_pool(void)
        {
            /* static: the pool keeps this pointer after the function returns */
            static struct my_pool_constraints c;
            struct gen_pool *pool = gen_pool_create(...);

            gen_pool_add(pool, ...);
            gen_pool_set_algo(pool, my_custom_algo, &c);
        }
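
    As an illustration only, the placeholder body of my_custom_algo above
    could be filled in for the alignment case roughly as follows. This is a
    self-contained userspace sketch, not kernel code: bit_is_set() and
    aligned_first_fit() are hypothetical helpers, 'align' is assumed to be
    expressed in bitmap bits, and returning 'size' to signal "no fit" is an
    assumption of the sketch.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical constraints structure; 'align' is counted in bitmap bits. */
struct my_pool_constraints {
    int align;
};

/* Return nonzero if bit 'i' of 'map' is set (that unit is in use). */
static int bit_is_set(const unsigned long *map, unsigned long i)
{
    return (int)((map[i / BITS_PER_WORD] >> (i % BITS_PER_WORD)) & 1UL);
}

/*
 * First-fit search restricted to aligned start indexes.
 * Returns the index of the first aligned free run of 'nr' bits,
 * or 'size' if no such run exists (assumed failure convention).
 */
static unsigned long aligned_first_fit(unsigned long *map, unsigned long size,
                                       unsigned long start, unsigned int nr,
                                       void *data)
{
    const struct my_pool_constraints *c = data;
    unsigned long align = (c && c->align > 0) ? (unsigned long)c->align : 1;
    unsigned long i, j;

    assert(nr > 0);

    /* Round the starting index up to the next multiple of 'align'. */
    i = (start + align - 1) / align * align;

    for (; i + nr <= size; i += align) {
        for (j = 0; j < nr; j++)
            if (bit_is_set(map, i + j))
                break;
        if (j == nr)
            return i;   /* all 'nr' bits from 'i' are free */
    }
    return size;
}
```

    With only bit 0 in use, a 2-bit request returns index 1 when no
    constraint is given and index 4 when 'align' is 4.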

    Add a best-fit algorithm function:
    most of the time best-fit is slower than first-fit, but memory fragmentation
    is lower. The random buffer allocation/free tests don't show any arithmetic
    relation between the allocation time and fragmentation, but the best-fit
    algorithm is sometimes able to perform the allocation when first-fit can't.
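
    The difference between the two strategies can be sketched in plain
    userspace C. This is not the kernel implementation: bit_is_set(),
    set_bits(), first_fit() and best_fit() are hypothetical names, the
    signature is modeled on the custom-algorithm example in this message,
    and returning 'size' when nothing fits is an assumption of the sketch.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Return nonzero if bit 'i' of 'map' is set (that unit is in use). */
static int bit_is_set(const unsigned long *map, unsigned long i)
{
    return (int)((map[i / BITS_PER_WORD] >> (i % BITS_PER_WORD)) & 1UL);
}

/* Mark 'nr' bits starting at 'start' as in use. */
static void set_bits(unsigned long *map, unsigned long start, unsigned int nr)
{
    for (unsigned long i = start; i < start + nr; i++)
        map[i / BITS_PER_WORD] |= 1UL << (i % BITS_PER_WORD);
}

/* First fit: start of the first free run of at least 'nr' bits. */
static unsigned long first_fit(unsigned long *map, unsigned long size,
                               unsigned long start, unsigned int nr, void *data)
{
    unsigned long run_start = start, run_len = 0;

    (void)data;
    assert(nr > 0);
    for (unsigned long i = start; i < size; i++) {
        if (bit_is_set(map, i)) {
            run_len = 0;
            continue;
        }
        if (run_len == 0)
            run_start = i;
        if (++run_len == nr)
            return run_start;
    }
    return size;
}

/* Best fit: start of the smallest free run of at least 'nr' bits. */
static unsigned long best_fit(unsigned long *map, unsigned long size,
                              unsigned long start, unsigned int nr, void *data)
{
    unsigned long best = size, best_len = ULONG_MAX;
    unsigned long i = start;

    (void)data;
    assert(nr > 0);
    while (i < size) {
        if (bit_is_set(map, i)) {
            i++;
            continue;
        }
        /* Measure the whole free run before judging it. */
        unsigned long run_start = i;
        while (i < size && !bit_is_set(map, i))
            i++;
        unsigned long run_len = i - run_start;
        if (run_len >= nr && run_len < best_len) {
            best = run_start;
            best_len = run_len;
            if (run_len == nr)
                break;  /* exact fit, cannot do better */
        }
    }
    return best;
}
```

    With free runs of 4, 2 and 8 bits, a 2-bit request lands at index 0
    under first-fit (splitting the 4-bit run) and at index 5 under best-fit
    (filling the 2-bit run exactly). Best-fit always scans every free run
    unless it finds an exact fit, which is where the extra time goes.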

    This new algorithm helps to remove static allocations on ESRAM, a small but
    fast on-chip RAM of a few KB, used for high-performance use cases like DMA
    linked lists, graphic accelerators, encoders/decoders. On the Ux500
    (in the ARM tree) we have defined 5 ESRAM banks of 128 KB each, and the use
    of static allocations becomes unmaintainable:
    cd arch/arm/mach-ux500 && grep -r ESRAM .
    ./include/mach/db8500-regs.h:/* Base address and bank offsets for ESRAM */
    ./include/mach/db8500-regs.h:#define U8500_ESRAM_BASE 0x40000000
    ./include/mach/db8500-regs.h:#define U8500_ESRAM_BANK_SIZE 0x00020000
    ./include/mach/db8500-regs.h:#define U8500_ESRAM_BANK0 U8500_ESRAM_BASE
    ./include/mach/db8500-regs.h:#define U8500_ESRAM_BANK1 (U8500_ESRAM_BASE + U8500_ESRAM_BANK_SIZE)
    ./include/mach/db8500-regs.h:#define U8500_ESRAM_BANK2 (U8500_ESRAM_BANK1 + U8500_ESRAM_BANK_SIZE)
    ./include/mach/db8500-regs.h:#define U8500_ESRAM_BANK3 (U8500_ESRAM_BANK2 + U8500_ESRAM_BANK_SIZE)
    ./include/mach/db8500-regs.h:#define U8500_ESRAM_BANK4 (U8500_ESRAM_BANK3 + U8500_ESRAM_BANK_SIZE)
    ./include/mach/db8500-regs.h:#define U8500_ESRAM_DMA_LCPA_OFFSET 0x10000
    ./include/mach/db8500-regs.h:#define U8500_DMA_LCPA_BASE (U8500_ESRAM_BANK0 + U8500_ESRAM_DMA_LCPA_OFFSET)
    ./include/mach/db8500-regs.h:#define U8500_DMA_LCLA_BASE U8500_ESRAM_BANK4

    I want to use genalloc to do dynamic allocations, but I need to be able to
    fine-tune the allocation algorithm. In my case the best-fit algorithm gives
    better results than first-fit, but that will not be true for every use case.

    Signed-off-by: Benjamin Gaignard
    Cc: Huang Ying
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Benjamin Gaignard
     

03 Aug, 2011

1 commit

  • This version of the gen_pool memory allocator supports lockless
    operation.

    This makes it safe to use in NMI handlers and other special
    unblockable contexts that could otherwise deadlock on locks. This is
    implemented by using atomic operations and retries on any conflicts.
    The disadvantage is that there may be livelocks in extreme cases. For
    better scalability, one gen_pool allocator can be used for each CPU.

    The lockless operation only works if there is enough memory available.
    If new memory is added to the pool, a lock still has to be taken. So
    any user relying on locklessness has to ensure that sufficient memory
    is preallocated.

    The basic atomic operation of this allocator is cmpxchg on long. On
    architectures that don't have an NMI-safe cmpxchg implementation, the
    allocator can NOT be used in NMI handlers. So code that uses the allocator
    in an NMI handler should depend on CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG.
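
    The "atomic operation plus retry" idea can be sketched in userspace
    with C11 atomics standing in for the kernel's cmpxchg. This is a
    minimal illustration under that assumption, not the code from this
    patch; set_bits_lockless() and clear_bits_lockless() are hypothetical
    names, and the sketch only handles bits within a single word.

```c
#include <assert.h>
#include <stdatomic.h>

/*
 * Atomically claim all bits in 'mask', but only if none of them is
 * already set. Returns 0 on success, -1 if any bit was already taken.
 */
static int set_bits_lockless(_Atomic unsigned long *word, unsigned long mask)
{
    unsigned long old = atomic_load(word);

    assert(mask != 0);
    do {
        if (old & mask)
            return -1;  /* someone else owns part of the range */
        /*
         * If the exchange fails, 'old' is refreshed with the current
         * value and the check above is repeated: this is the retry.
         */
    } while (!atomic_compare_exchange_weak(word, &old, old | mask));
    return 0;
}

/*
 * Atomically release all bits in 'mask', but only if all of them are
 * currently set. Returns 0 on success, -1 otherwise.
 */
static int clear_bits_lockless(_Atomic unsigned long *word, unsigned long mask)
{
    unsigned long old = atomic_load(word);

    assert(mask != 0);
    do {
        if ((old & mask) != mask)
            return -1;  /* range was not fully allocated */
    } while (!atomic_compare_exchange_weak(word, &old, old & ~mask));
    return 0;
}
```

    No lock is held at any point, so an interrupted caller cannot block
    the interrupting one. The cost is the retry loop, which is where the
    livelock risk mentioned above comes from.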

    Signed-off-by: Huang Ying
    Reviewed-by: Andi Kleen
    Reviewed-by: Mathieu Desnoyers
    Cc: Andrew Morton
    Signed-off-by: Len Brown

    Huang Ying
     

25 May, 2011

2 commits


02 Oct, 2006

1 commit

  • Modules using the genpool allocator need to be able to destroy the data
    structure when unloading.

    Signed-off-by: Steve Wise
    Cc: Randy Dunlap
    Cc: Dean Nelson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Steve Wise
     

23 Jun, 2006

1 commit

  • Modify the gen_pool allocator (lib/genalloc.c) to utilize a bitmap scheme
    instead of the buddy scheme. The purpose of this change is to eliminate
    the touching of the actual memory being allocated.
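
    A minimal sketch of why a bitmap scheme avoids touching the managed
    memory: all bookkeeping lives in a separate structure, and addresses
    are only computed, never dereferenced. The struct chunk layout, the
    64-unit limit, the helper names and the use of 0 as the failure value
    are assumptions of this userspace illustration, not the interface
    introduced by the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bookkeeping for one region; stored outside the managed memory. */
struct chunk {
    uintptr_t start;      /* first managed address, never dereferenced */
    unsigned int order;   /* one bit covers (1 << order) bytes */
    unsigned int nbits;   /* number of units, at most 64 in this sketch */
    uint64_t bits;        /* bit i set => unit i is in use */
};

/* Number of units needed to hold 'size' bytes. */
static unsigned int units_for(const struct chunk *c, size_t size)
{
    size_t unit = (size_t)1 << c->order;

    return (unsigned int)((size + unit - 1) >> c->order);
}

/* Mask with the low 'nr' bits set, for 1 <= nr <= 64. */
static uint64_t run_mask(unsigned int nr)
{
    return nr == 64 ? ~UINT64_C(0) : (UINT64_C(1) << nr) - 1;
}

/* First-fit allocation; returns the address, or 0 on failure. */
static uintptr_t chunk_alloc(struct chunk *c, size_t size)
{
    unsigned int nr = units_for(c, size);

    assert(c->nbits <= 64);
    if (nr == 0 || nr > c->nbits)
        return 0;
    for (unsigned int i = 0; i + nr <= c->nbits; i++) {
        uint64_t mask = run_mask(nr) << i;

        if ((c->bits & mask) == 0) {
            c->bits |= mask;    /* only the bitmap is written */
            return c->start + ((uintptr_t)i << c->order);
        }
    }
    return 0;
}

/* Release a range previously returned by chunk_alloc(). */
static void chunk_free(struct chunk *c, uintptr_t addr, size_t size)
{
    unsigned int nr = units_for(c, size);
    unsigned int i = (unsigned int)((addr - c->start) >> c->order);

    assert(nr > 0 && i + nr <= c->nbits);
    c->bits &= ~(run_mask(nr) << i);
}
```

    A buddy or free-list allocator typically stores its links inside the
    free blocks themselves, which is not acceptable when the managed
    region is device memory or uncached memory that must not be written
    by the allocator.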

    Since the change modifies the interface, a change to the uncached allocator
    (arch/ia64/kernel/uncached.c) is also required.

    Both Andrey Volkov and Jes Sorensen have expressed a desire that the
    gen_pool allocator not write to the memory being managed. See the
    following:

    http://marc.theaimsgroup.com/?l=linux-kernel&m=113518602713125&w=2
    http://marc.theaimsgroup.com/?l=linux-kernel&m=113533568827916&w=2

    Signed-off-by: Dean Nelson
    Cc: Andrey Volkov
    Acked-by: Jes Sorensen
    Cc: "Luck, Tony"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dean Nelson
     

22 Jun, 2005

1 commit

  • This patch contains the ia64 uncached page allocator and the generic
    allocator (genalloc). The uncached allocator was formerly part of the SN2
    mspec driver but there are several other users of it so it has been split
    off from the driver.

    The generic allocator can be used by device drivers to manage special memory
    etc. The generic allocator is based on the allocator from the sym53c8xx_2
    driver.

    Various users on ia64 need uncached memory. The SGI SN architecture requires
    it for inter-partition communication between partitions within a large NUMA
    cluster. The specific user for this is the XPC code. Another application is
    large MPI-style applications which use it for synchronization; on SN this can
    be done using special 'fetchop' operations, but it also benefits non-SN
    hardware which may use regular uncached memory for this purpose. The
    performance difference between doing this through uncached vs cached memory
    is pretty substantial. This is handled by the mspec driver, which I will
    push out in a separate patch.

    Rather than creating a specific allocator for just uncached memory, I came up
    with genalloc, which is a general-purpose allocator that can be used by device
    drivers and other subsystems as they please, for instance to handle onboard
    device memory. It was derived from the sym53c8xx_2 driver's allocator, which
    is also an example of a potential user (I am refraining from modifying sym2
    right now as it seems to have been under fairly heavy development recently).

    On ia64, memory has various properties within a granule, i.e. it isn't safe
    to access memory as uncached within a granule that currently has memory
    accessed in cached mode. The regular system therefore doesn't utilize memory
    in the lower granules, which is mixed in with device PAL code etc. The
    uncached driver walks the EFI memmap, pulls out the spill uncached pages,
    and sticks them into the uncached pool. Only after these chunks have been
    utilized will it start converting regular cached memory into uncached memory.
    Hence the EFI-related code additions.

    Signed-off-by: Jes Sorensen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jes Sorensen