06 Dec, 2011

2 commits

  • Provide a helper function to compute the kernel memory size needed
    for each buffer object. Move all the accounting inside ttm, simplifying
    drivers and avoiding code duplication across them.

    v2: fix accounting of ghost objects. One would have thought that I
    would have run into the issue a long time ago, but it seems
    ghost objects are rare when you have plenty of vram ;)

    Signed-off-by: Jerome Glisse
    Reviewed-by: Thomas Hellstrom

    Jerome Glisse
     
  • This was never used in any of the drivers; properly using userspace
    pages for a bo would need more code (vma interaction mostly). Remove
    this dead code in preparation for the ttm_tt & backend merge.

    Signed-off-by: Jerome Glisse
    Reviewed-by: Konrad Rzeszutek Wilk
    Reviewed-by: Thomas Hellstrom

    Jerome Glisse
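
For the accounting helper described in the first of the two commits above, a minimal user-space sketch of the idea follows. The function names, the 4096-byte page size, and the exact terms of the sum are assumptions made for illustration; the commit message does not spell out the real formula.

```c
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096UL /* assumed page size, for illustration only */

/* Round a byte count up to a whole number of pages. */
size_t sketch_round_up_pages(size_t bytes)
{
    return (bytes + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}

/*
 * Hypothetical accounting helper: kernel memory charged for one buffer
 * object is taken to be the driver's bo structure, one page pointer per
 * backing page, and a fixed per-object bookkeeping structure.
 * This is an assumed formula, not the one used by ttm.
 */
size_t sketch_bo_acc_size(size_t bo_size, size_t struct_size,
                          size_t tt_struct_size)
{
    size_t npages = sketch_round_up_pages(bo_size);

    return struct_size + npages * sizeof(void *) + tt_struct_size;
}
```

With the computation in one place, each driver only passes the size of its own bo structure instead of repeating the page arithmetic.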
     

21 Jun, 2011

1 commit


08 Apr, 2011

1 commit


05 Apr, 2011

1 commit


31 Mar, 2011

1 commit


22 Nov, 2010

3 commits

  • This patch attempts to fix up shortcomings with the current calling
    sequences.

    1) There's a fastpath where no locking occurs and only io_mem_reserve is
    called to obtain needed info for mapping. The fastpath is set per
    memory type manager.
    2) If the fastpath is disabled, io_mem_reserve and io_mem_free will be exactly
    balanced and not called recursively for the same struct ttm_mem_reg.
    3) Optionally the driver can choose to enable a per memory type manager LRU
    eviction mechanism that, when io_mem_reserve returns -EAGAIN, will attempt
    to kill user-space mappings of memory in that manager to free up needed
    resources.

    Signed-off-by: Thomas Hellstrom
    Reviewed-by: Ben Skeggs
    Signed-off-by: Dave Airlie

    Thomas Hellstrom
     
  • The bo lock is used only to protect the bo sync object members, and since it
    is a per-bo lock, fencing a buffer list will see a lot of locks and unlocks.
    Replace it with a per-device lock that protects the sync object members on
    *all* bos. Reading and setting these members will always be very quick, so
    the risk of heavy lock contention is microscopic. Note that waiting for
    sync objects will always take place outside of this lock.

    The bo device fence lock will eventually be replaced with a seqlock /
    rcu mechanism so we can determine that a bo is idle under an
    rcu / read seqlock.

    However this change will allow us to batch fencing and unreserving of
    buffers with a minimal amount of locking.

    Signed-off-by: Thomas Hellstrom
    Reviewed-by: Jerome Glisse
    Signed-off-by: Dave Airlie

    Thomas Hellstrom
     
  • Makes it possible to reserve a list of buffer objects with a single
    spin lock / unlock if there is no contention.
    Should improve cpu usage on SMP kernels.

    v2: Initialize private list members on reserve and don't call
    ttm_bo_list_ref_sub() with zero put_count.

    Signed-off-by: Thomas Hellstrom
    Signed-off-by: Dave Airlie

    Dave Airlie
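
For the first commit in this group (the io_mem_reserve / io_mem_free rework), the "exactly balanced, not called recursively" rule of point 2 can be sketched with a per-region counter. This is a user-space illustration only: the struct, field, and function names are hypothetical, and the driver hooks are replaced by call counters.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a memory region; not the real struct ttm_mem_reg. */
struct sketch_mem_reg {
    unsigned reserved_count;  /* outstanding reserve requests (slow path) */
    int driver_reserve_calls; /* how often the driver reserve hook ran */
    int driver_free_calls;    /* how often the driver free hook ran */
};

/* Reserve I/O space for a region. */
void sketch_io_reserve(struct sketch_mem_reg *mem, bool fastpath)
{
    if (fastpath) {
        /* Fastpath: no counting, the hook only supplies mapping info. */
        mem->driver_reserve_calls++;
        return;
    }
    /* Slow path: only the first request reaches the driver. */
    if (mem->reserved_count++ == 0)
        mem->driver_reserve_calls++;
}

/* Release I/O space for a region. */
void sketch_io_free(struct sketch_mem_reg *mem, bool fastpath)
{
    if (fastpath || mem->reserved_count == 0)
        return; /* nothing to balance */
    /* Slow path: only the last release reaches the driver. */
    if (--mem->reserved_count == 0)
        mem->driver_free_calls++;
}
```

Two nested reserve/free pairs on the slow path therefore result in a single driver reserve call and a single driver free call.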
     

10 Nov, 2010

1 commit


19 Oct, 2010

1 commit


06 Oct, 2010

1 commit

  • This fixes a race pointed out by Dave Airlie where we don't take a buffer
    object about to be destroyed off the LRU lists properly. It also fixes a rare
    case where a buffer object could be destroyed in the middle of an
    accelerated eviction.

    The patch also adds a utility function that can be used to prematurely
    release GPU memory space usage of an object waiting to be destroyed,
    for example during eviction or swapout.

    The above mentioned commit didn't queue the buffer on the delayed destroy
    list under some rare circumstances. It also didn't completely honor the
    remove_all parameter.

    Fixes:
    https://bugzilla.redhat.com/show_bug.cgi?id=615505
    http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=591061

    Signed-off-by: Thomas Hellstrom
    Signed-off-by: Dave Airlie

    Thomas Hellstrom
     

05 Oct, 2010

1 commit


18 May, 2010

1 commit


20 Apr, 2010

1 commit

  • On fault the driver is given the opportunity to perform any operation
    it sees fit in order to place the buffer into a CPU visible area of
    memory. This patch doesn't break TTM users: nouveau, vmwgfx and radeon
    should keep working properly. Future patches will take advantage of this
    infrastructure and remove the old path from TTM once drivers are
    converted.

    V2: return VM_FAULT_NOPAGE if the callback returns -EBUSY or -ERESTARTSYS
    V3: balance io_mem_reserve and io_mem_free calls; fault_reserve_notify
    is responsible for performing any task necessary for the mapping to succeed
    V4: minor cleanup, atomic_t -> bool as the member is protected by the reserve
    mechanism from concurrent access
    V5: the callback is now responsible for iomapping the bo and providing
    a virtual address; this simplifies TTM and will allow getting rid of
    TTM_MEMTYPE_FLAG_NEEDS_IOREMAP
    V6: use the bus address data to decide whether an ioremap is needed;
    the callback does not necessarily have to ioremap, so drivers can
    still use a static mapping

    Signed-off-by: Jerome Glisse
    Reviewed-by: Thomas Hellstrom
    Signed-off-by: Dave Airlie

    Jerome Glisse
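
The V2 note above (retry the fault when the callback reports -EBUSY or -ERESTARTSYS) can be sketched as a small mapping function. The enum, the function name, and the locally defined error numbers are stand-ins for illustration, and treating every other error as a bus error is an assumption not stated in the commit message.

```c
/* Local stand-ins so the sketch builds outside the kernel. */
#define SKETCH_EBUSY       16
#define SKETCH_ERESTARTSYS 512

/* Hypothetical fault outcomes, mirroring the VM_FAULT_* idea. */
enum sketch_fault_result {
    SKETCH_FAULT_OK,     /* mapping can proceed */
    SKETCH_FAULT_NOPAGE, /* no page installed now; the access is retried */
    SKETCH_FAULT_SIGBUS  /* assumed result for any other error */
};

/* Translate the return value of the driver's fault callback. */
enum sketch_fault_result sketch_fault_from_notify(int ret)
{
    switch (ret) {
    case 0:
        return SKETCH_FAULT_OK;
    case -SKETCH_EBUSY:
    case -SKETCH_ERESTARTSYS:
        /* Transient conditions: let the faulting access run again. */
        return SKETCH_FAULT_NOPAGE;
    default:
        return SKETCH_FAULT_SIGBUS;
    }
}
```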
     

08 Apr, 2010

1 commit

  • There are cases where we want to be able to wait only for the
    GPU while not waiting for other buffers to be unreserved. This
    patch splits the no_wait argument all the way down the whole
    ttm path so that the upper level can decide what to wait on and
    what not to.

    [airlied: squashed these 4 for bisectability reasons.]
    drm/radeon/kms: update to TTM no_wait splitted argument
    drm/nouveau: update to TTM no_wait splitted argument
    drm/vmwgfx: update to TTM no_wait splitted argument
    [vmwgfx patch: Reviewed-by: Thomas Hellstrom ]

    Signed-off-by: Jerome Glisse
    Acked-by: Thomas Hellstrom
    Signed-off-by: Dave Airlie

    Jerome Glisse
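
A minimal sketch of what splitting the single no_wait argument means for a caller. Everything here is a user-space toy: the struct, the function, and the parameter names no_wait_reserve and no_wait_gpu are chosen for illustration, and "waiting" is simulated by clearing a flag.

```c
#include <stdbool.h>

#define SKETCH_EBUSY 16 /* local stand-in for EBUSY */

/* Hypothetical buffer object state. */
struct sketch_bo {
    bool reserved_by_other; /* another thread holds the reservation */
    bool gpu_busy;          /* the GPU is still using the buffer */
};

/*
 * Each kind of wait is controlled on its own: the caller can refuse to
 * wait for a reservation while still being willing to wait for the GPU,
 * or the other way round.
 */
int sketch_validate(struct sketch_bo *bo, bool no_wait_reserve,
                    bool no_wait_gpu)
{
    if (bo->reserved_by_other) {
        if (no_wait_reserve)
            return -SKETCH_EBUSY;
        bo->reserved_by_other = false; /* simulated wait for unreserve */
    }
    if (bo->gpu_busy) {
        if (no_wait_gpu)
            return -SKETCH_EBUSY;
        bo->gpu_busy = false; /* simulated wait for the GPU */
    }
    return 0;
}
```

Calling `sketch_validate(&bo, true, false)` expresses the case from the commit message: wait for the GPU, but fail immediately if another buffer user holds the reservation.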
     

11 Dec, 2009

1 commit

  • Convert ttm_buffer_object_init to use struct ttm_placement and
    rename it to ttm_bo_init for consistency with function naming. This
    allows giving a more complex placement at buffer creation. For
    instance, you can ask to allocate a bo in vram first, but if there is
    not enough vram you can give system as a second possible
    placement. It also allows creating a buffer in a specific range.

    Also rename ttm_buffer_object_validate to ttm_bo_validate.

    Signed-off-by: Jerome Glisse
    Signed-off-by: Dave Airlie

    Jerome Glisse
     

10 Dec, 2009

2 commits

  • Return -ERESTARTSYS instead of -ERESTART when interrupted by a signal.
    The -ERESTARTSYS is converted to an -EINTR by the kernel signal layer
    before being returned to user-space.

    Signed-off-by: Thomas Hellstrom
    Signed-off-by: Jerome Glisse
    Signed-off-by: Dave Airlie

    Thomas Hellstrom
     
  • This change allows drivers to pass sorted memory placements,
    from most preferred placement to least preferred placement.
    In order to avoid long function prototypes, a structure is
    used to gather memory placement information such as range
    restriction (if you need a buffer to be in a given range).
    Range restriction is determined by fpfn & lpfn, which are
    the first and last page numbers between which allocation
    can happen. If those fields are set to 0, ttm will assume the
    buffer can be put anywhere in the address space (thus it
    avoids putting a burden on the driver to always properly
    set those fields).

    This patch also factors out a few functions, like evicting the first
    entry of the lru list or getting a memory space. This avoids
    code duplication.

    V2: Change API to use placement flags and an array instead
    of packing placement order into a quadword.
    V3: Make sure we set the appropriate mem.placement flag
    when validating or allocating memory space.

    [Pending Thomas Hellstrom further review but okay
    from preliminary review so far].

    Signed-off-by: Jerome Glisse
    Signed-off-by: Dave Airlie

    Jerome Glisse
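
The sorted-placement idea from the commit above can be sketched in user space as follows. The pool model (a simple bump allocator), every `sketch_` name, and the exclusive treatment of lpfn are assumptions for illustration; only the fpfn/lpfn fields, the "0 means anywhere" rule, and the most-preferred-first ordering come from the commit message.

```c
#include <stddef.h>

enum sketch_mem_type { SKETCH_SYSTEM, SKETCH_VRAM, SKETCH_NUM_TYPES };

/* Toy memory pool: pages [0, size_pages), allocated front to back. */
struct sketch_pool {
    unsigned long size_pages;
    unsigned long next_free;
};

/* Hypothetical placement request, loosely modelled on the description. */
struct sketch_placement {
    unsigned long fpfn, lpfn;              /* both 0: no range restriction */
    size_t num_placement;
    const enum sketch_mem_type *placement; /* most preferred first */
};

/* Try one pool; return the first page of the allocation or -1. */
long sketch_alloc_in_pool(struct sketch_pool *pool, unsigned long num_pages,
                          unsigned long fpfn, unsigned long lpfn)
{
    unsigned long start = pool->next_free > fpfn ? pool->next_free : fpfn;
    unsigned long limit = (lpfn != 0 && lpfn < pool->size_pages)
                              ? lpfn : pool->size_pages;

    if (start > limit || num_pages > limit - start)
        return -1;
    pool->next_free = start + num_pages;
    return (long)start;
}

/* Walk the placements in order; return the chosen memory type or -1. */
int sketch_place(struct sketch_pool *pools, const struct sketch_placement *p,
                 unsigned long num_pages, long *start_out)
{
    for (size_t i = 0; i < p->num_placement; i++) {
        long start = sketch_alloc_in_pool(&pools[p->placement[i]],
                                          num_pages, p->fpfn, p->lpfn);
        if (start >= 0) {
            *start_out = start;
            return (int)p->placement[i];
        }
    }
    return -1;
}
```

A request listing vram first and system second lands in vram while it fits and falls back to system once vram is exhausted, which is the behaviour the placement array is meant to express.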
     

19 Aug, 2009

2 commits

  • Common resources, like memory accounting and swap lists, should be
    global and not per device. Introduce a struct ttm_bo_global to
    accommodate this, and register it with sysfs. Add a small sysfs interface
    to return the number of active buffer objects.

    Signed-off-by: Thomas Hellstrom
    Signed-off-by: Dave Airlie

    Thomas Hellstrom
     
  • A micro-optimization on the function ttm_kmap_obj_virtual().

    By defining the values of enum ttm_bo_kmap_obj::bo_kmap_type to have a
    bit indicating iomem, the size of the function ttm_kmap_obj_virtual() will be
    reduced by 16 bytes on x86_64 (gcc 4.1.2).

    ttm_kmap_obj_virtual() may be heavily used when buffer objects are
    accessed via wrappers that work for both kinds of memory addresses:
    iomem cookies and kernel virtual.

    Signed-off-by: Pekka Paalanen
    Signed-off-by: Dave Airlie

    Pekka Paalanen
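
The micro-optimization in the commit above can be illustrated with a sketch in which one bit of the enum value marks iomem mappings, so the accessor needs a single mask test instead of comparing against several enum values. The names, the numeric values, and the choice of which mapping kinds count as iomem are assumptions for illustration.

```c
#include <stdbool.h>

#define SKETCH_MAP_IOMEM_MASK 0x80 /* assumed flag bit marking iomem */

/* Hypothetical mapping kinds; iomem kinds carry the flag bit. */
enum sketch_kmap_type {
    SKETCH_MAP_IOMAP     = 1 | SKETCH_MAP_IOMEM_MASK,
    SKETCH_MAP_VMAP      = 2,
    SKETCH_MAP_KMAP      = 3,
    SKETCH_MAP_PREMAPPED = 4 | SKETCH_MAP_IOMEM_MASK
};

struct sketch_kmap_obj {
    void *virtual_addr;
    enum sketch_kmap_type type;
};

/* Return the mapped address and report whether it is an iomem cookie. */
void *sketch_kmap_obj_virtual(const struct sketch_kmap_obj *map,
                              bool *is_iomem)
{
    *is_iomem = (map->type & SKETCH_MAP_IOMEM_MASK) != 0;
    return map->virtual_addr;
}
```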
     

15 Jun, 2009

1 commit

  • TTM is a GPU memory manager subsystem designed for use with GPU
    devices with various memory types (On-card VRAM, AGP,
    PCI apertures etc.). It's essentially a helper library that assists
    the DRM driver in creating and managing persistent buffer objects.

    TTM manages placement of data and CPU map setup and teardown on
    data movement. It can also optionally manage synchronization of
    data on a per-buffer-object level.

    TTM takes care to provide an always valid virtual user-space address
    to a buffer object which makes user-space sub-allocation of
    big buffer objects feasible.

    TTM uses a fine-grained per buffer-object locking scheme, taking
    care to release all relevant locks when waiting for the GPU.
    Although this implies some locking overhead, it's probably a big
    win for devices with multiple command submission mechanisms, since
    the lock contention will be minimal.

    TTM can be used with whatever user-space interface the driver
    chooses, including GEM. It's used by the upcoming Radeon KMS DRM driver
    and is also the GPU memory management core of various new experimental
    DRM drivers.

    Signed-off-by: Thomas Hellstrom
    Signed-off-by: Jerome Glisse
    Signed-off-by: Dave Airlie

    Thomas Hellstrom