13 Jan, 2021

1 commit

  • [ Upstream commit 36845663843fc59c5d794e3dc0641472e3e572da ]

    Some graphics cards have very large on-chip memory, such as 32 GiB.

    In the following case, it will cause overflow:

    pool = gen_pool_create(PAGE_SHIFT, NUMA_NO_NODE);
    ret = gen_pool_add(pool, 0x1000000, SZ_32G, NUMA_NO_NODE);

    va = gen_pool_alloc(pool, SZ_4G);

    The overflow occurs in gen_pool_alloc_algo_owner():

    ....
    size = nbits << order;
    ....

    @nbits has type "int", so the shift overflows, and
    gen_pool_avail() then returns the wrong value.

    This patch converts some "int" variables to "unsigned long" and
    changes the comparison in the while loop.
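
    The arithmetic can be reproduced in user space. The sketch below is
    not the kernel code: both helper names are hypothetical, a PAGE_SHIFT
    of 12 is assumed, and a 32-bit unsigned type stands in for the old
    "int" so that the wraparound is well defined in C.

```c
#include <stdint.h>

/* Hypothetical helper: size computed in 32 bits, modelling the old "int" nbits. */
static uint32_t size_from_nbits_32(uint32_t nbits, unsigned int order)
{
	return nbits << order;	/* wraps modulo 2^32 */
}

/* Hypothetical helper: the same shift in 64 bits, modelling "unsigned long"
 * on a 64-bit kernel. */
static uint64_t size_from_nbits_64(uint64_t nbits, unsigned int order)
{
	return nbits << order;
}
```

    A 4 GiB request with order 12 is 2^20 bits; shifting that back by 12
    needs 33 bits, so the 32-bit helper yields 0 while the 64-bit one
    yields 2^32.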

    Link: https://lkml.kernel.org/r/20201229060657.3389-1-sjhuang@iluvatar.ai
    Signed-off-by: Huang Shijie
    Reported-by: Shi Jiasheng
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Sasha Levin

    Huang Shijie
     

05 Dec, 2019

2 commits

  • Follow the kernel conventions, rename addr_in_gen_pool to
    gen_pool_has_addr.

    [sjhuang@iluvatar.ai: fix Documentation/ too]
    Link: http://lkml.kernel.org/r/20181229015914.5573-1-sjhuang@iluvatar.ai
    Link: http://lkml.kernel.org/r/20181228083950.20398-1-sjhuang@iluvatar.ai
    Signed-off-by: Huang Shijie
    Reviewed-by: Andrew Morton
    Cc: Russell King
    Cc: Arnd Bergmann
    Cc: Greg Kroah-Hartman
    Cc: Christoph Hellwig
    Cc: Marek Szyprowski
    Cc: Robin Murphy
    Cc: Stephen Rothwell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Huang Shijie
     
  • We use addr_in_gen_pool() in a driver module. So export it.

    Link: http://lkml.kernel.org/r/20181224070622.22197-2-sjhuang@iluvatar.ai
    Signed-off-by: Huang Shijie
    Reviewed-by: Andrew Morton
    Cc: Alexey Skidanov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Huang Shijie
     

07 Oct, 2019

1 commit

  • Commit 795ee30648c7 ("lib/genalloc: introduce chunk owners") made a number
    of changes to the genalloc API and implementation but did not update the
    documentation to match, leading to these docs build warnings:

    ./lib/genalloc.c:1: warning: 'gen_pool_add_virt' not found
    ./lib/genalloc.c:1: warning: 'gen_pool_alloc' not found
    ./lib/genalloc.c:1: warning: 'gen_pool_free' not found
    ./lib/genalloc.c:1: warning: 'gen_pool_alloc_algo' not found

    Fix these by updating the docs to match new function locations and names,
    and by completing the update of one kerneldoc comment.

    Fixes: 795ee30648c7 ("lib/genalloc: introduce chunk owners")
    Acked-by: Dan Williams
    Signed-off-by: Jonathan Corbet

    Jonathan Corbet
     

13 Jul, 2019

1 commit

  • Pull dma-mapping updates from Christoph Hellwig:

    - move the USB special case that bounced DMA through a device bar into
    the USB code instead of handling it in the common DMA code (Laurentiu
    Tudor and Fredrik Noring)

    - don't dip into the global CMA pool for single page allocations
    (Nicolin Chen)

    - fix a crash when allocating memory for the atomic pool failed during
    boot (Florian Fainelli)

    - move support for MIPS-style uncached segments to the common code and
    use that for MIPS and nios2 (me)

    - make support for DMA_ATTR_NON_CONSISTENT and
    DMA_ATTR_NO_KERNEL_MAPPING generic (me)

    - convert nds32 to the generic remapping allocator (me)

    * tag 'dma-mapping-5.3' of git://git.infradead.org/users/hch/dma-mapping: (29 commits)
    dma-mapping: mark dma_alloc_need_uncached as __always_inline
    MIPS: only select ARCH_HAS_UNCACHED_SEGMENT for non-coherent platforms
    usb: host: Fix excessive alignment restriction for local memory allocations
    lib/genalloc.c: Add algorithm, align and zeroed family of DMA allocators
    nios2: use the generic uncached segment support in dma-direct
    nds32: use the generic remapping allocator for coherent DMA allocations
    arc: use the generic remapping allocator for coherent DMA allocations
    dma-direct: handle DMA_ATTR_NO_KERNEL_MAPPING in common code
    dma-direct: handle DMA_ATTR_NON_CONSISTENT in common code
    dma-mapping: add a dma_alloc_need_uncached helper
    openrisc: remove the partial DMA_ATTR_NON_CONSISTENT support
    arc: remove the partial DMA_ATTR_NON_CONSISTENT support
    arm-nommu: remove the partial DMA_ATTR_NON_CONSISTENT support
    ARM: dma-mapping: allow larger DMA mask than supported
    dma-mapping: truncate dma masks to what dma_addr_t can hold
    iommu/dma: Apply dma_{alloc,free}_contiguous functions
    dma-remap: Avoid de-referencing NULL atomic_pool
    MIPS: use the generic uncached segment support in dma-direct
    dma-direct: provide generic support for uncached kernel segments
    au1100fb: fix DMA API abuse
    ...

    Linus Torvalds
     

28 Jun, 2019

1 commit


19 Jun, 2019

1 commit

  • Based on 2 normalized pattern(s):

    this source code is licensed under the gnu general public license
    version 2 see the file copying for more details

    this source code is licensed under general public license version 2
    see

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-only

    has been chosen to replace the boilerplate/reference in 52 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Enrico Weigelt
    Reviewed-by: Allison Randal
    Reviewed-by: Alexios Zavras
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190602204653.449021192@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

14 Jun, 2019

1 commit

  • The p2pdma facility enables a provider to publish a pool of dma
    addresses for a consumer to allocate. A genpool is used internally by
    p2pdma to collect dma resources, 'chunks', to be handed out to
    consumers. Whenever a consumer allocates a resource it needs to pin the
    'struct dev_pagemap' instance that backs the chunk selected by
    pci_alloc_p2pmem().

    Currently that reference is taken globally on the entire provider
    device. That sets up a lifetime mismatch whereby the p2pdma core needs
    to maintain hacks to make sure the percpu_ref is not released twice.

    This lifetime mismatch also stands in the way of a fix to
    devm_memremap_pages() whereby devm_memremap_pages_release() must wait for
    the percpu_ref ->release() callback to complete before it can proceed to
    teardown pages.

    So, towards fixing this situation, introduce the ability to store a 'chunk
    owner' at gen_pool_add() time, and a facility to retrieve the owner at
    gen_pool_{alloc,free}() time. For p2pdma this will be used to store and
    recall individual dev_pagemap reference counter instances per-chunk.
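
    As a rough illustration of the idea (not the genalloc implementation),
    the user-space sketch below attaches an owner cookie to each chunk and
    reports it at allocation time. The struct, its fields and
    toy_alloc_owner() are hypothetical, and a bump allocator replaces the
    real bitmap to keep the example short.

```c
#include <stddef.h>

/* Toy model, not the kernel structures: each chunk remembers an owner cookie. */
struct toy_chunk {
	unsigned long start;	/* first address covered by the chunk */
	unsigned long size;	/* bytes covered by the chunk */
	unsigned long next;	/* bump offset of the next free byte */
	void *owner;		/* cookie stored when the chunk is added */
};

/* Hand out @size bytes from the first chunk with room and report its owner. */
static unsigned long toy_alloc_owner(struct toy_chunk *chunks, size_t n,
				     unsigned long size, void **owner)
{
	for (size_t i = 0; i < n; i++) {
		struct toy_chunk *c = &chunks[i];

		if (c->size - c->next < size)
			continue;
		if (owner)
			*owner = c->owner;
		c->next += size;
		return c->start + c->next - size;
	}
	if (owner)
		*owner = NULL;	/* nothing allocated, so no owner to report */
	return 0;
}
```

    A caller that pins a per-chunk reference would take it on the returned
    owner rather than on the whole provider device.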

    Link: http://lkml.kernel.org/r/155727338118.292046.13407378933221579644.stgit@dwillia2-desk3.amr.corp.intel.com
    Signed-off-by: Dan Williams
    Reviewed-by: Ira Weiny
    Reviewed-by: Logan Gunthorpe
    Cc: Bjorn Helgaas
    Cc: "Jérôme Glisse"
    Cc: Christoph Hellwig
    Cc: Greg Kroah-Hartman
    Cc: "Rafael J. Wysocki"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dan Williams
     

03 Jun, 2019

1 commit


06 Jan, 2019

1 commit

  • Fixes build break on most ARM/ARM64 defconfigs:

    lib/genalloc.c: In function 'gen_pool_add_virt':
    lib/genalloc.c:190:10: error: implicit declaration of function 'vzalloc_node'; did you mean 'kzalloc_node'?
    lib/genalloc.c:190:8: warning: assignment to 'struct gen_pool_chunk *' from 'int' makes pointer from integer without a cast [-Wint-conversion]
    lib/genalloc.c: In function 'gen_pool_destroy':
    lib/genalloc.c:254:3: error: implicit declaration of function 'vfree'; did you mean 'kfree'?

    Fixes: 6862d2fc8185 ('lib/genalloc.c: use vzalloc_node() to allocate the bitmap')
    Cc: Huang Shijie
    Cc: Andrew Morton
    Cc: Alexey Skidanov
    Signed-off-by: Olof Johansson
    Signed-off-by: Linus Torvalds

    Olof Johansson
     

05 Jan, 2019

2 commits

  • Some devices may have large on-chip memory, such as over 1G. In some
    cases, nbytes may be bigger than 4M, which is the boundary of the
    memory buddy system (4K default).

    So use vzalloc_node() to allocate the bitmap. Also use vfree() to free
    it.

    Link: http://lkml.kernel.org/r/20181225015701.6289-1-sjhuang@iluvatar.ai
    Signed-off-by: Huang Shijie
    Reviewed-by: Andrew Morton
    Cc: Alexey Skidanov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Huang Shijie
     
  • gen_pool_alloc_algo() uses different allocation functions implementing
    different allocation algorithms. With gen_pool_first_fit_align()
    allocation function, the returned address should be aligned on the
    requested boundary.

    If the chunk start address isn't aligned on the requested boundary, the
    returned address isn't aligned either. The only way to get a properly
    aligned address is to initialize the pool with chunks aligned on the
    requested boundary. To be able to allocate buffers aligned on different
    boundaries (for example, 4K, 1MB, ...), the chunk start address should
    be aligned on the maximum possible alignment.

    This happens because gen_pool_first_fit_align() looks for properly
    aligned memory block without taking into account the chunk start address
    alignment.

    To fix this, we provide chunk start address to
    gen_pool_first_fit_align() and change its implementation such that it
    starts looking for properly aligned block with appropriate offset
    (exactly as is done in CMA).
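
    The offset trick can be sketched in user space as follows. This is an
    illustrative model only: it uses one bool per allocation granule
    instead of the kernel bitmap, and the function names are hypothetical.
    The search rounds up (index + offset) instead of the index alone, so
    the resulting address is aligned even when the chunk start is not.

```c
#include <stdbool.h>

/*
 * Toy first-fit search over a bool-per-granule map (not the kernel bitmap).
 * Returns the first index >= start such that nr granules are free and
 * (index + align_off) is a multiple of (align_mask + 1); returns size
 * when no such run exists.
 */
static unsigned long find_zero_area_off(const bool *map, unsigned long size,
					unsigned long start, unsigned long nr,
					unsigned long align_mask,
					unsigned long align_off)
{
	unsigned long index = start;

	for (;;) {
		unsigned long i;

		while (index < size && map[index])
			index++;
		/* Round up in "address space", then shift back to an index. */
		index = ((index + align_off + align_mask) & ~align_mask) - align_off;
		if (index + nr > size)
			return size;
		for (i = index; i < index + nr; i++)
			if (map[i])
				break;
		if (i == index + nr)
			return index;
		index = i + 1;
	}
}

/* Granule offset of the chunk start within an align-sized window. */
static unsigned long chunk_align_off(unsigned long start_addr,
				     unsigned long align, unsigned int order)
{
	return (start_addr & (align - 1)) >> order;
}
```

    For a chunk starting at 0x1000 with 4K granules and a 16K alignment
    request, the offset is one granule, so the search settles on index 3,
    which corresponds to address 0x4000.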

    Link: https://lkml.kernel.org/lkml/a170cf65-6884-3592-1de9-4c235888cc8a@intel.com
    Link: http://lkml.kernel.org/r/1541690953-4623-1-git-send-email-alexey.skidanov@intel.com
    Signed-off-by: Alexey Skidanov
    Reviewed-by: Andrew Morton
    Cc: Logan Gunthorpe
    Cc: Daniel Mentz
    Cc: Mathieu Desnoyers
    Cc: Laura Abbott
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Skidanov
     

18 Nov, 2017

1 commit

  • If the amount of resources allocated to a gen_pool exceeds 2^32 then the
    avail atomic overflows and this causes problems when clients try and
    borrow resources from the pool. This is only expected to be an issue on
    64 bit systems.

    Add the <linux/atomic.h> header to pull in atomic_long* operations, so
    that 32 bit systems continue to use atomic32_t but 64 bit systems can
    use atomic64_t.

    Link: http://lkml.kernel.org/r/1509033843-25667-1-git-send-email-sbates@raithlin.com
    Signed-off-by: Stephen Bates
    Reviewed-by: Logan Gunthorpe
    Reviewed-by: Mathieu Desnoyers
    Reviewed-by: Daniel Mentz
    Cc: Jonathan Corbet
    Cc: Andrew Morton
    Cc: Will Deacon
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Stephen Bates
     

28 Oct, 2016

1 commit

  • gen_pool_alloc_algo() iterates over the chunks of a pool trying to find
    a contiguous block of memory that satisfies the allocation request.

    The shortcut

    if (size > atomic_read(&chunk->avail))
            continue;

    makes the loop skip over chunks that do not have enough bytes left to
    fulfill the request. There are two situations, though, where an
    allocation might still fail:

    (1) The available memory is not contiguous, i.e. the request cannot
    be fulfilled due to external fragmentation.

    (2) A race condition. Another thread runs the same code concurrently
    and is quicker to grab the available memory.

    In those situations, the loop calls pool->algo() to search the entire
    chunk, and pool->algo() returns some value that is >= end_bit to
    indicate that the search failed. This return value is then assigned to
    start_bit. The variables start_bit and end_bit describe the range that
    should be searched, and this range should be reset for every chunk that
    is searched. Today, the code fails to reset start_bit to 0. As a
    result, prefixes of subsequent chunks are ignored. Memory allocations
    might fail even though there is plenty of room left in these prefixes of
    those other chunks.
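
    The effect of the stale start_bit can be reproduced with a small
    user-space model. Everything here is hypothetical scaffolding (the
    toy_ names, the 8-granule chunks, the bool-per-granule map); only the
    control flow mirrors the description above, with a flag selecting
    whether start_bit is reset for each chunk.

```c
#include <stdbool.h>
#include <stddef.h>

#define CHUNK_BITS 8

struct toy_chunk {
	bool used[CHUNK_BITS];
};

/* First-fit stand-in for pool->algo(): returns >= size when nothing fits. */
static unsigned long toy_first_fit(const bool *map, unsigned long size,
				   unsigned long start, unsigned long nr)
{
	unsigned long run = 0;

	for (unsigned long i = start; i < size; i++) {
		run = map[i] ? 0 : run + 1;
		if (run == nr)
			return i + 1 - nr;
	}
	return size;
}

/* Returns a pool-wide granule index, or -1 if no chunk can satisfy the request. */
static long toy_pool_alloc(struct toy_chunk *chunks, size_t nchunks,
			   unsigned long nr, bool reset_start_bit)
{
	unsigned long start_bit = 0;

	for (size_t c = 0; c < nchunks; c++) {
		if (reset_start_bit)
			start_bit = 0;	/* the fix: search each chunk from bit 0 */
		start_bit = toy_first_fit(chunks[c].used, CHUNK_BITS, start_bit, nr);
		if (start_bit >= CHUNK_BITS)
			continue;	/* a stale start_bit leaks into the next chunk */
		for (unsigned long i = 0; i < nr; i++)
			chunks[c].used[start_bit + i] = true;
		return (long)(c * CHUNK_BITS + start_bit);
	}
	return -1;
}
```

    With a fragmented first chunk and an empty second chunk, the variant
    without the reset fails even though the second chunk is entirely free.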

    Fixes: 7f184275aa30 ("lib, Make gen_pool memory allocator lockless")
    Link: http://lkml.kernel.org/r/1477420604-28918-1-git-send-email-danielmentz@google.com
    Signed-off-by: Daniel Mentz
    Reviewed-by: Mathieu Desnoyers
    Acked-by: Will Deacon
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Daniel Mentz
     

23 Dec, 2015

3 commits


05 Sep, 2015

2 commits

  • This change fills devm_gen_pool_create()/gen_pool_get() "name" argument
    stub with contents and extends of_gen_pool_get() functionality on this
    basis.

    If there is no associated platform device with a device node passed to
    of_gen_pool_get(), the function attempts to get a label property or device
    node name (= repeats MTD OF partition standard) and seeks for a named
    gen_pool registered by device of the parent device node.

    The main idea of the change is to allow registration of independent
    gen_pools under the same umbrella device, say "partitions" on "storage
    device", the original functionality of one "partition" per "storage
    device" is untouched.

    [akpm@linux-foundation.org: fix constness in devres_find()]
    [dan.carpenter@oracle.com: freeing const data pointers]
    Signed-off-by: Vladimir Zapolskiy
    Cc: Philipp Zabel
    Cc: Greg Kroah-Hartman
    Cc: Russell King
    Cc: Nicolas Ferre
    Cc: Alexandre Belloni
    Cc: Jean-Christophe Plagniol-Villard
    Cc: Shawn Guo
    Cc: Sascha Hauer
    Cc: Mauro Carvalho Chehab
    Cc: Arnd Bergmann
    Signed-off-by: Dan Carpenter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vladimir Zapolskiy
     
  • This change modifies gen_pool_get() and devm_gen_pool_create() client
    interfaces adding one more argument "name" of a gen_pool object.

    Due to its implementation, gen_pool_get() can retrieve only one
    gen_pool associated with a device even if multiple gen_pools are
    created. Fortunately this is sufficient for the current clients, hence
    provide NULL as a valid argument on both producer devm_gen_pool_create()
    and consumer gen_pool_get() sides.

    Because only one created gen_pool per device is addressable, explicitly
    add a restriction to devm_gen_pool_create() to create only one gen_pool
    per device, this implies two possible error codes returned by the
    function, account it on client side (only misc/sram). This completes
    client side changes related to genalloc updates.

    [akpm@linux-foundation.org: gen_pool_get() cleanup]
    Signed-off-by: Vladimir Zapolskiy
    Cc: Philipp Zabel
    Cc: Greg Kroah-Hartman
    Cc: Russell King
    Cc: Nicolas Ferre
    Cc: Alexandre Belloni
    Cc: Jean-Christophe Plagniol-Villard
    Cc: Shawn Guo
    Cc: Sascha Hauer
    Cc: Mauro Carvalho Chehab
    Cc: Arnd Bergmann
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vladimir Zapolskiy
     

01 Jul, 2015

2 commits

  • To be consistent with other kernel interface namings, rename
    of_get_named_gen_pool() to of_gen_pool_get(). In the original function
    name "_named" suffix references to a device tree property, which contains
    a phandle to a device and the corresponding device driver is assumed to
    register a gen_pool object.

    Due to a weak relation and to avoid any confusion (e.g. in future
    possible scenario if gen_pool objects are named) the suffix is removed.

    [sfr@canb.auug.org.au: crypto/marvell/cesa - fix up for of_get_named_gen_pool() rename]
    Signed-off-by: Vladimir Zapolskiy
    Cc: Nicolas Ferre
    Cc: Philipp Zabel
    Cc: Shawn Guo
    Cc: Sascha Hauer
    Cc: Alexandre Belloni
    Cc: Russell King
    Cc: Mauro Carvalho Chehab
    Cc: Vinod Koul
    Cc: Takashi Iwai
    Cc: Jaroslav Kysela
    Signed-off-by: Stephen Rothwell
    Cc: Herbert Xu
    Cc: Boris BREZILLON
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vladimir Zapolskiy
     
  • To be consistent with other genalloc interface namings, rename
    dev_get_gen_pool() to gen_pool_get(). The "dev_" prefix is dropped
    because it only reflects the argument type of the function and so
    does not bring any useful information.

    [akpm@linux-foundation.org: update arch/arm/mach-socfpga/pm.c]
    Signed-off-by: Vladimir Zapolskiy
    Acked-by: Nicolas Ferre
    Cc: Philipp Zabel
    Cc: Shawn Guo
    Cc: Sascha Hauer
    Cc: Alexandre Belloni
    Cc: Russell King
    Cc: Mauro Carvalho Chehab
    Cc: Vinod Koul
    Cc: Takashi Iwai
    Cc: Jaroslav Kysela
    Cc: Mark Brown
    Cc: Nicolas Ferre
    Cc: Alan Tull
    Cc: Dinh Nguyen
    Cc: Kevin Hilman
    Cc: Stephen Rothwell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vladimir Zapolskiy
     

14 Feb, 2015

1 commit

  • devm_gen_pool_create() calls devres_alloc() and dereferences its result
    without checking whether devres_alloc() succeeded. Check for error and
    bail out if it happened.

    Coverity-id 1016493.

    Signed-off-by: Jan Kara
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara
     

13 Feb, 2015

2 commits


04 Dec, 2014

1 commit


10 Oct, 2014

2 commits

  • After allocating an address from a particular genpool, there is no good
    way to verify if that address actually belongs to a genpool. Introduce
    addr_in_gen_pool, which returns whether an address plus size falls
    completely within the genpool range.
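
    A minimal user-space sketch of such a range check is shown below. The
    struct and function names are hypothetical, and chunks are assumed to
    carry an inclusive end address; the real helper walks the pool's chunk
    list instead of a plain array.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy chunk: an inclusive [start, end] address range (field names assumed). */
struct toy_chunk {
	unsigned long start;
	unsigned long end;	/* last valid address, inclusive */
};

/* True when [addr, addr + size - 1] lies entirely inside one chunk. */
static bool toy_addr_in_pool(const struct toy_chunk *chunks, size_t n,
			     unsigned long addr, size_t size)
{
	unsigned long last = addr + size - 1;

	for (size_t i = 0; i < n; i++)
		if (addr >= chunks[i].start && last <= chunks[i].end &&
		    last >= addr)	/* rejects wraparound and size 0 */
			return true;
	return false;
}
```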

    Signed-off-by: Laura Abbott
    Acked-by: Will Deacon
    Reviewed-by: Olof Johansson
    Reviewed-by: Catalin Marinas
    Cc: Arnd Bergmann
    Cc: David Riley
    Cc: Ritesh Harjain
    Cc: Russell King
    Cc: Thierry Reding
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Laura Abbott
     
  • One of the more common algorithms used for allocation is to align the
    start address of the allocation to the order of size requested. Add this
    as an algorithm option for genalloc.

    Signed-off-by: Laura Abbott
    Acked-by: Will Deacon
    Acked-by: Olof Johansson
    Reviewed-by: Catalin Marinas
    Cc: Arnd Bergmann
    Cc: David Riley
    Cc: Ritesh Harjain
    Cc: Russell King
    Cc: Thierry Reding
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Laura Abbott
     

26 Sep, 2014

1 commit


30 Jan, 2014

1 commit

  • In gen_pool_dma_alloc() the dma pointer can be NULL, and storing
    gen_pool_virt_to_phys(pool, vaddr) through it caused the following
    crash on da850 evm:

    Unable to handle kernel NULL pointer dereference at virtual address 00000000
    Internal error: Oops: 805 [#1] PREEMPT ARM
    Modules linked in:
    CPU: 0 PID: 1 Comm: swapper Tainted: G W 3.13.0-rc1-00001-g0609e45-dirty #5
    task: c4830000 ti: c4832000 task.ti: c4832000
    PC is at gen_pool_dma_alloc+0x30/0x3c
    LR is at gen_pool_virt_to_phys+0x74/0x80
    Process swapper, call trace:
    gen_pool_dma_alloc+0x30/0x3c
    davinci_pm_probe+0x40/0xa8
    platform_drv_probe+0x1c/0x4c
    driver_probe_device+0x98/0x22c
    __driver_attach+0x8c/0x90
    bus_for_each_dev+0x6c/0x8c
    bus_add_driver+0x124/0x1d4
    driver_register+0x78/0xf8
    platform_driver_probe+0x20/0xa4
    davinci_init_late+0xc/0x14
    init_machine_late+0x1c/0x28
    do_one_initcall+0x34/0x15c
    kernel_init_freeable+0xe4/0x1ac
    kernel_init+0x8/0xec

    This patch fixes the above.
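
    The shape of such a guard can be sketched in user space as below. This
    is an assumption-laden model rather than the kernel function: the
    toy_ types and the fixed virtual-to-physical offset are invented for
    illustration, and only the "write the DMA address if the caller asked
    for it" check reflects the problem described above.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t toy_dma_addr_t;	/* stand-in for dma_addr_t */

/* Toy pool: one linear region with a fixed virtual-to-physical offset. */
struct toy_pool {
	uintptr_t virt_base;
	toy_dma_addr_t phys_base;
	size_t size;
	size_t next;		/* bump offset of the next free byte */
};

/* Returns a virtual address (0 on failure); fills *dma only when asked. */
static uintptr_t toy_dma_alloc(struct toy_pool *pool, size_t size,
			       toy_dma_addr_t *dma)
{
	uintptr_t vaddr;

	if (pool->size - pool->next < size)
		return 0;
	vaddr = pool->virt_base + pool->next;
	pool->next += size;

	if (dma)		/* the guard: callers may pass NULL */
		*dma = pool->phys_base + (vaddr - pool->virt_base);
	return vaddr;
}
```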

    [akpm@linux-foundation.org: update kerneldoc]
    Signed-off-by: Lad, Prabhakar
    Cc: Philipp Zabel
    Cc: Nicolin Chen
    Cc: Joe Perches
    Cc: Sachin Kamat
    Cc: [3.13.x]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Lad, Prabhakar
     

13 Nov, 2013

1 commit

  • When using pool space for DMA buffer, there might be duplicated calling of
    gen_pool_alloc() and gen_pool_virt_to_phys() in each implementation.

    Thus it's better to add a simple helper function, a compatible one to the
    common dma_alloc_coherent(), to save some code.

    Signed-off-by: Nicolin Chen
    Cc: "Hans J. Koch"
    Cc: Dan Williams
    Cc: Eric Miao
    Cc: Grant Likely
    Cc: Greg Kroah-Hartman
    Cc: Haojian Zhuang
    Cc: Jaroslav Kysela
    Cc: Kevin Hilman
    Cc: Liam Girdwood
    Cc: Mark Brown
    Cc: Mauro Carvalho Chehab
    Cc: Rob Herring
    Cc: Russell King
    Cc: Sekhar Nori
    Cc: Takashi Iwai
    Cc: Vinod Koul
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nicolin Chen
     

12 Sep, 2013

3 commits

  • The documentation mentions a "name" parameter, which does not exist. This
    commit removes such mention from the function documentation.

    Signed-off-by: Emilio López
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Emilio López
     
  • Use the helper function instead of __GFP_ZERO.

    Signed-off-by: Joe Perches
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joe Perches
     
  • In struct gen_pool_chunk, end_addr means the end address of memory chunk
    (inclusive), but in the implementation it is treated as address + size of
    memory chunk (exclusive), so it points to the address plus one instead of
    correct ending address.

    The ending address of memory chunk plus one will cause overflow on the
    memory chunk including the last address of memory map, e.g. when starting
    address is 0xFFF00000 and size is 0x100000 on 32bit machine, ending
    address will be 0x100000000.

    Use the correct ending address, i.e. starting address + size - 1.
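
    The wraparound is easy to check in user space. In this sketch the
    helper names are hypothetical and uint32_t models a 32-bit address.

```c
#include <stdint.h>

/* Exclusive end: one past the last byte; wraps to 0 at the top of memory. */
static uint32_t end_exclusive(uint32_t start, uint32_t size)
{
	return start + size;
}

/* Inclusive end: the last byte of the chunk; always representable. */
static uint32_t end_inclusive(uint32_t start, uint32_t size)
{
	return start + size - 1;
}
```

    For start 0xFFF00000 and size 0x100000 the exclusive form wraps to 0,
    while the inclusive form gives 0xFFFFFFFF.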

    [akpm@linux-foundation.org: add comment to struct gen_pool_chunk:end_addr]
    Signed-off-by: Joonyoung Shim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joonyoung Shim
     

30 Apr, 2013

1 commit

  • This patch adds three exported functions to lib/genalloc.c:
    devm_gen_pool_create, dev_get_gen_pool, and of_get_named_gen_pool.

    devm_gen_pool_create is a managed version of gen_pool_create that keeps
    track of the pool via devres and allows the management code to
    automatically destroy it after device removal.

    dev_get_gen_pool retrieves the gen_pool for a given device, if it was
    created with devm_gen_pool_create, using devres_find.

    of_get_named_gen_pool retrieves the gen_pool for a given device node and
    property name, where the property must contain a phandle pointing to a
    platform device node. The corresponding platform device is then fed into
    dev_get_gen_pool and the resulting gen_pool is returned.

    [akpm@linux-foundation.org: make the of_get_named_gen_pool() stub static, fixing a zillion link errors]
    [akpm@linux-foundation.org: squish "struct device declared inside parameter list" warning]
    Signed-off-by: Philipp Zabel
    Acked-by: Grant Likely
    Tested-by: Michal Simek
    Cc: Fabio Estevam
    Cc: Matt Porter
    Cc: Dong Aisheng
    Cc: Greg Kroah-Hartman
    Cc: Rob Herring
    Cc: Paul Gortmaker
    Cc: Javier Martin
    Cc: Huang Shijie
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Philipp Zabel
     

26 Oct, 2012

1 commit

  • The genalloc code uses the bitmap API from include/linux/bitmap.h and
    lib/bitmap.c, which is based on long values. Both bitmap_set from
    lib/bitmap.c and bitmap_set_ll, which is the lockless version from
    genalloc.c, use BITMAP_LAST_WORD_MASK to set the first bits in a long in
    the bitmap.

    That one uses (1 << bits) - 1, 0b111, if you are setting the first three
    bits. This means that the API counts from the least significant bits
    (LSB from now on) to the MSB. The LSB in the first long is bit 0, then.
    The same works for the lookup functions.

    The genalloc code uses longs for the bitmap, as it should. In
    include/linux/genalloc.h, struct gen_pool_chunk has unsigned long
    bits[0] as its last member. When allocating the struct, genalloc should
    reserve enough space for the bitmap. This should be a proper number of
    longs that can fit the amount of bits in the bitmap.

    However, genalloc allocates an integer number of bytes that fit the
    amount of bits, but may not be an integer amount of longs. 9 bytes, for
    example, could be allocated for 70 bits.
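
    The two ways of sizing the bitmap can be compared with a short
    user-space sketch. The names are hypothetical, and a 64-bit word
    stands in for "long" so the numbers do not depend on the build
    machine.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t toy_word_t;	/* stand-in for a 64-bit "long" */
#define TOY_BITS_PER_BYTE 8
#define TOY_BITS_PER_WORD (sizeof(toy_word_t) * TOY_BITS_PER_BYTE)

/* Bitmap size rounded up to whole bytes (the problematic sizing). */
static size_t bitmap_bytes_by_byte(size_t nbits)
{
	return (nbits + TOY_BITS_PER_BYTE - 1) / TOY_BITS_PER_BYTE;
}

/* Bitmap size rounded up to whole words, as the bitmap API expects. */
static size_t bitmap_bytes_by_word(size_t nbits)
{
	size_t words = (nbits + TOY_BITS_PER_WORD - 1) / TOY_BITS_PER_WORD;

	return words * sizeof(toy_word_t);
}
```

    For 70 bits the byte-based sizing gives 9 bytes, while two full 64-bit
    words need 16 bytes, so 7 bytes of the second word are never allocated.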

    This is a problem in itself if the Least Significant Bit in a long is in
    the byte with the largest address, which happens in Big Endian machines.
    This means genalloc is not allocating the byte in which it will try to
    set or check for a bit.

    This may end up in memory corruption, where genalloc will try to set the
    bits it has not allocated. In fact, genalloc may not set these bits
    because it may find them already set, because they were not zeroed since
    they were not allocated. And that's what causes a BUG when
    gen_pool_destroy is called and checks for any set bits.

    What really happens is that genalloc uses kmalloc_node with __GFP_ZERO
    on gen_pool_add_virt. With SLAB and SLUB, this means the whole slab
    will be cleared, not only the requested bytes. Since struct
    gen_pool_chunk has a size that is a multiple of 8, and slab sizes are
    multiples of 8, we get lucky and allocate and clear the right amount of
    bytes.

    However, this is not the case with SLOB or with older code that did
    memset after allocating instead of using __GFP_ZERO.

    So a simple module such as this one (running 3.6.0) will cause a crash
    when rmmod'ed.

    [root@phantom-lp2 foo]# cat foo.c
    #include <linux/kernel.h>
    #include <linux/module.h>
    #include <linux/init.h>
    #include <linux/genalloc.h>

    MODULE_LICENSE("GPL");
    MODULE_VERSION("0.1");

    static struct gen_pool *foo_pool;

    static __init int foo_init(void)
    {
            int ret;
            foo_pool = gen_pool_create(10, -1);
            if (!foo_pool)
                    return -ENOMEM;
            ret = gen_pool_add(foo_pool, 0xa0000000, 32 << 10, -1);
            if (ret) {
                    gen_pool_destroy(foo_pool);
                    return ret;
            }
            return 0;
    }

    static __exit void foo_exit(void)
    {
            gen_pool_destroy(foo_pool);
    }

    module_init(foo_init);
    module_exit(foo_exit);
    [root@phantom-lp2 foo]# zcat /proc/config.gz | grep SLOB
    CONFIG_SLOB=y
    [root@phantom-lp2 foo]# insmod ./foo.ko
    [root@phantom-lp2 foo]# rmmod foo
    ------------[ cut here ]------------
    kernel BUG at lib/genalloc.c:243!
    cpu 0x4: Vector: 700 (Program Check) at [c0000000bb0e7960]
    pc: c0000000003cb50c: .gen_pool_destroy+0xac/0x110
    lr: c0000000003cb4fc: .gen_pool_destroy+0x9c/0x110
    sp: c0000000bb0e7be0
    msr: 8000000000029032
    current = 0xc0000000bb0e0000
    paca = 0xc000000006d30e00 softe: 0 irq_happened: 0x01
    pid = 13044, comm = rmmod
    kernel BUG at lib/genalloc.c:243!
    [c0000000bb0e7ca0] d000000004b00020 .foo_exit+0x20/0x38 [foo]
    [c0000000bb0e7d20] c0000000000dff98 .SyS_delete_module+0x1a8/0x290
    [c0000000bb0e7e30] c0000000000097d4 syscall_exit+0x0/0x94
    --- Exception: c00 (System Call) at 000000800753d1a0
    SP (fffd0b0e640) is in userspace

    Signed-off-by: Thadeu Lima de Souza Cascardo
    Cc: Paul Gortmaker
    Cc: Benjamin Gaignard
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thadeu Lima de Souza Cascardo
     

06 Oct, 2012

1 commit

  • Permit use of another algorithm than the default first-fit one. For
    example a custom algorithm could be used to manage alignment requirements.

    As I can't predict all the possible requirements/needs for all allocation
    use cases, I add a "free" field 'void *data' to pass any needed
    information to the allocation function. For example 'data' could be used
    to handle a structure where you store the alignment, the expected memory
    bank, the requester device, or any information that could influence the
    allocation algorithm.

    A usage example may look like this:

    struct my_pool_constraints {
            int align;
            int bank;
            ...
    };

    unsigned long my_custom_algo(unsigned long *map, unsigned long size,
                    unsigned long start, unsigned int nr, void *data)
    {
            struct my_pool_constraints *constraints = data;
            ...
            /* deal with allocation constraints */
            ...
            /* return the index in the bitmap at which to perform the allocation */
    }

    void create_my_pool()
    {
            struct my_pool_constraints c;
            struct gen_pool *pool = gen_pool_create(...);
            gen_pool_add(pool, ...);
            gen_pool_set_algo(pool, my_custom_algo, &c);
    }

    Add a best-fit algorithm function:
    most of the time best-fit is slower than first-fit, but memory
    fragmentation is lower. The random buffer allocation/free tests don't
    show any arithmetic relation between the allocation time and
    fragmentation, but the best-fit algorithm is sometimes able to perform
    the allocation when first-fit can't.
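
    The difference between the two policies can be illustrated with a
    user-space sketch. It is not the genalloc code: the toy_ functions are
    hypothetical and operate on one bool per allocation granule rather
    than on the kernel bitmap.

```c
#include <stdbool.h>

/* First free run of at least nr granules at or after start; size if none. */
static unsigned long toy_first_fit(const bool *map, unsigned long size,
				   unsigned long start, unsigned long nr)
{
	unsigned long run = 0;

	for (unsigned long i = start; i < size; i++) {
		run = map[i] ? 0 : run + 1;
		if (run == nr)
			return i + 1 - nr;
	}
	return size;
}

/* Start of the smallest free run that still holds nr granules; size if none. */
static unsigned long toy_best_fit(const bool *map, unsigned long size,
				  unsigned long nr)
{
	unsigned long best = size, best_len = size + 1;
	unsigned long i = 0;

	while (i < size) {
		unsigned long begin, len;

		if (map[i]) {
			i++;
			continue;
		}
		begin = i;
		while (i < size && !map[i])
			i++;
		len = i - begin;	/* length of this free run */
		if (len >= nr && len < best_len) {
			best = begin;
			best_len = len;
		}
	}
	return best;
}
```

    Best-fit scans every free run, which is why it tends to be slower, but
    it leaves the larger runs intact for later, larger requests.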

    This new algorithm helps to remove static allocations on ESRAM, a small but
    fast on-chip RAM of a few KB, used for high-performance use cases like DMA
    linked lists, graphic accelerators, encoders/decoders. On the Ux500
    (in the ARM tree) we have defined 5 ESRAM banks of 128 KB each and use of
    static allocations becomes unmaintainable:
    cd arch/arm/mach-ux500 && grep -r ESRAM .
    ./include/mach/db8500-regs.h:/* Base address and bank offsets for ESRAM */
    ./include/mach/db8500-regs.h:#define U8500_ESRAM_BASE 0x40000000
    ./include/mach/db8500-regs.h:#define U8500_ESRAM_BANK_SIZE 0x00020000
    ./include/mach/db8500-regs.h:#define U8500_ESRAM_BANK0 U8500_ESRAM_BASE
    ./include/mach/db8500-regs.h:#define U8500_ESRAM_BANK1 (U8500_ESRAM_BASE + U8500_ESRAM_BANK_SIZE)
    ./include/mach/db8500-regs.h:#define U8500_ESRAM_BANK2 (U8500_ESRAM_BANK1 + U8500_ESRAM_BANK_SIZE)
    ./include/mach/db8500-regs.h:#define U8500_ESRAM_BANK3 (U8500_ESRAM_BANK2 + U8500_ESRAM_BANK_SIZE)
    ./include/mach/db8500-regs.h:#define U8500_ESRAM_BANK4 (U8500_ESRAM_BANK3 + U8500_ESRAM_BANK_SIZE)
    ./include/mach/db8500-regs.h:#define U8500_ESRAM_DMA_LCPA_OFFSET 0x10000
    ./include/mach/db8500-regs.h:#define U8500_DMA_LCPA_BASE (U8500_ESRAM_BANK0 + U8500_ESRAM_DMA_LCPA_OFFSET)
    ./include/mach/db8500-regs.h:#define U8500_DMA_LCLA_BASE U8500_ESRAM_BANK4

    I want to use genalloc to do dynamic allocations but I need to be able to
    fine tune the allocation algorithm. In my case the best-fit algorithm gives
    better results than first-fit, but it will not be true for every use case.

    Signed-off-by: Benjamin Gaignard
    Cc: Huang Ying
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Benjamin Gaignard
     

08 Mar, 2012

1 commit


03 Aug, 2011

1 commit

  • This version of the gen_pool memory allocator supports lockless
    operation.

    This makes it safe to use in NMI handlers and other special
    unblockable contexts that could otherwise deadlock on locks. This is
    implemented by using atomic operations and retries on any conflicts.
    The disadvantage is that there may be livelocks in extreme cases. For
    better scalability, one gen_pool allocator can be used for each CPU.

    The lockless operation only works if there is enough memory available.
    If new memory is added to the pool, a lock still has to be taken. So
    any user relying on locklessness has to ensure that sufficient memory
    is preallocated.

    The basic atomic operation of this allocator is cmpxchg on long. On
    architectures that don't have an NMI-safe cmpxchg implementation, the
    allocator can NOT be used in NMI handlers. So code that uses the
    allocator in an NMI handler should depend on
    CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG.
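
    The retry-on-conflict pattern can be sketched with C11 atomics in user
    space. This is an analogy, not the kernel implementation: the toy_
    function names are hypothetical and atomic_compare_exchange_weak()
    stands in for the kernel's cmpxchg on a long.

```c
#include <stdatomic.h>

/*
 * Try to set every bit in mask. Returns 0 on success, or -1 if any of
 * the bits is already set (another context owns part of the range).
 */
static int toy_set_bits(atomic_ulong *word, unsigned long mask)
{
	unsigned long old = atomic_load(word);

	do {
		if (old & mask)
			return -1;
		/* On failure the CAS reloads 'old' and the loop retries. */
	} while (!atomic_compare_exchange_weak(word, &old, old | mask));

	return 0;
}

/* Clear every bit in mask; returns -1 if some bit was not set. */
static int toy_clear_bits(atomic_ulong *word, unsigned long mask)
{
	unsigned long old = atomic_load(word);

	do {
		if ((old & mask) != mask)
			return -1;
	} while (!atomic_compare_exchange_weak(word, &old, old & ~mask));

	return 0;
}
```

    No lock is held at any point, so an interrupted context cannot block
    the interrupting one; the cost is that a loser of the race has to
    retry, which is where the livelock risk in extreme cases comes from.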

    Signed-off-by: Huang Ying
    Reviewed-by: Andi Kleen
    Reviewed-by: Mathieu Desnoyers
    Cc: Andrew Morton
    Signed-off-by: Len Brown

    Huang Ying
     

25 May, 2011

1 commit


30 Jun, 2010

1 commit

  • bitmap_find_next_zero_area requires the size of the bitmap, but we
    passed the last suitable position instead. This made it impossible to
    allocate from the end of the pool.

    Fixes a regression introduced by 243797f59b748f679ab88d456fcc4f92236d724b
    ("genalloc: use bitmap_find_next_zero_area").

    Signed-off-by: Imre Deak
    Cc: Zygo Blaxell
    Cc: Tejun Heo
    Acked-by: Akinobu Mita
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Imre Deak