13 Jan, 2012

1 commit

  • There exists at least one NVIDIA GPU (Quadro NVS 300) that has a DMS-59
    connector which is capable of supporting DisplayPort, TMDS and VGA on
    a single connector.

    We need to bump the allowed encoder limit to support all three configs.

    Signed-off-by: Ben Skeggs
    Signed-off-by: Dave Airlie

    Ben Skeggs
     

11 Jan, 2012

1 commit

  • * 'drm-core-next' of git://people.freedesktop.org/~airlied/linux: (307 commits)
    drm/nouveau/pm: fix build with HWMON off
    gma500: silence gcc warnings in mid_get_vbt_data()
    drm/ttm: fix condition (and vs or)
    drm/radeon: double lock typo in radeon_vm_bo_rmv()
    drm/radeon: use after free in radeon_vm_bo_add()
    drm/sis|via: don't return stack garbage from free_mem ioctl
    drm/radeon/kms: remove pointless CS flags priority struct
    drm/radeon/kms: check if vm is supported in VA ioctl
    drm: introduce drm_can_sleep and use in intel/radeon drivers. (v2)
    radeon: Fix disabling PCI bus mastering on big endian hosts.
    ttm: fix agp since ttm tt rework
    agp: Fix multi-line warning message whitespace
    drm/ttm/dma: Fix accounting error when calling ttm_mem_global_free_page and don't try to free freed pages.
    drm/ttm/dma: Only call set_pages_array_wb when the page is not in WB pool.
    drm/radeon/kms: sync across multiple rings when doing bo moves v3
    drm/radeon/kms: Add support for multi-ring sync in CS ioctl (v2)
    drm/radeon: GPU virtual memory support v22
    drm: make DRM_UNLOCKED ioctls with their own mutex
    drm: no need to hold global mutex for static data
    drm/radeon/benchmark: common modes sweep ignores 640x480@32
    ...

    Fix up trivial conflicts in radeon/evergreen.c and vmwgfx/vmwgfx_kms.c

    Linus Torvalds
     

09 Jan, 2012

2 commits

  • Signed-off-by: Alex Deucher
    Cc: Christian König
    Signed-off-by: Dave Airlie

    Alex Deucher
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (53 commits)
    Kconfig: acpi: Fix typo in comment.
    misc latin1 to utf8 conversions
    devres: Fix a typo in devm_kfree comment
    btrfs: free-space-cache.c: remove extra semicolon.
    fat: Spelling s/obsolate/obsolete/g
    SCSI, pmcraid: Fix spelling error in a pmcraid_err() call
    tools/power turbostat: update fields in manpage
    mac80211: drop spelling fix
    types.h: fix comment spelling for 'architectures'
    typo fixes: aera -> area, exntension -> extension
    devices.txt: Fix typo of 'VMware'.
    sis900: Fix enum typo 'sis900_rx_bufer_status'
    decompress_bunzip2: remove invalid vi modeline
    treewide: Fix comment and string typo 'bufer'
    hyper-v: Update MAINTAINERS
    treewide: Fix typos in various parts of the kernel, and fix some comments.
    clockevents: drop unknown Kconfig symbol GENERIC_CLOCKEVENTS_MIGR
    gpio: Kconfig: drop unknown symbol 'CS5535_GPIO'
    leds: Kconfig: Fix typo 'D2NET_V2'
    sound: Kconfig: drop unknown symbol ARCH_CLPS7500
    ...

    Fix up trivial conflicts in arch/powerpc/platforms/40x/Kconfig (some new
    kconfig additions, close to removed commented-out old ones)

    Linus Torvalds
     

06 Jan, 2012

4 commits

  • So we have a few places where the drm drivers would like to sleep to
    be nice to the system, mainly in the modesetting paths, but we also
    have two cases where atomic modesetting must take place: panic writing
    and the kernel debugger. So provide a central inline to determine if a
    sleep or delay should be used, and use this in the intel and radeon drivers.
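
    As a rough illustration of that decision, the sketch below models the
    execution context with a hypothetical struct exec_ctx and picks between
    sleeping and busy-waiting. It is a userspace model, not the kernel helper
    itself; the struct, its fields and pick_wait() are assumptions made for
    this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the execution context; in the kernel this
 * information comes from the scheduler and debugger state. */
struct exec_ctx {
	bool in_atomic;      /* e.g. writing a panic message */
	bool in_debugger;    /* kernel debugger is driving modesetting */
	bool irqs_disabled;
};

enum wait_kind { WAIT_SLEEP, WAIT_BUSY_DELAY };

/* Sleeping is only allowed when none of the atomic conditions hold. */
static bool can_sleep(const struct exec_ctx *ctx)
{
	assert(ctx != NULL);
	return !(ctx->in_atomic || ctx->in_debugger || ctx->irqs_disabled);
}

/* Callers ask once and then either sleep or spin for the wait period. */
static enum wait_kind pick_wait(const struct exec_ctx *ctx)
{
	return can_sleep(ctx) ? WAIT_SLEEP : WAIT_BUSY_DELAY;
}
```

    Keeping the check in one place means each driver wait site only has to
    branch on the result instead of repeating the context tests.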

    v2: drop intel_drv.h MSLEEP macro, nobody uses it.

    Based on patch from Michel Dänzer

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=43941

    Reviewed-by: Daniel Vetter
    Signed-off-by: Dave Airlie

    Dave Airlie
     
  • The ttm tt rework modified the way we allocate and populate the
    ttm_tt structure, and the AGP side was missing some bits needed to
    work properly. Fix those and fix radeon and nouveau AGP support.

    Tested on radeon only so far.

    Signed-off-by: Jerome Glisse
    Reviewed-by: Konrad Rzeszutek Wilk
    Signed-off-by: Dave Airlie

    Jerome Glisse
     
  • Use semaphores to sync buffers across rings in the CS
    ioctl. Add a reloc flag to allow userspace to skip
    sync for buffers.

    agd5f: port to latest CS ioctl changes.

    v2: add ring lock/unlock to make sure changes hit the ring.

    Signed-off-by: Christian König
    Signed-off-by: Alex Deucher
    Signed-off-by: Dave Airlie

    Christian König
     
  • Virtual address spaces are per drm client (opener of /dev/drm).
    Clients are in charge of their virtual address space; they need to
    map bos into it by calling the DRM_RADEON_GEM_VA ioctl.

    The first 16M of virtual address space is reserved by the kernel.

    Once we use a 2 level page table we should be able to have a small
    vram memory footprint for each pt (there would be one pt for all
    gart, one for all vram and then one first level for each virtual
    address space).

    The plan includes using the sub allocator for a common vm page table
    area and using memcpy to copy vm page tables in & out. Or use
    a gart object and copy things in & out using dma.

    v2: agd5f fixes:
    - Add vram base offset for vram pages. The GPU physical address of a
    vram page is FB_OFFSET + page offset. FB_OFFSET is 0 on discrete
    cards and the physical bus address of the stolen memory on
    integrated chips.
    - VM_CONTEXT1_PROTECTION_FAULT_DEFAULT_ADDR covers all vmid's >= 1
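
    The vram address rule in the v2 notes can be written out as a small
    sketch. The function name and the overflow guard are assumptions for
    illustration; only the FB_OFFSET + page offset relation comes from the
    notes above.

```c
#include <assert.h>
#include <stdint.h>

/* fb_offset follows the description above: 0 on discrete cards, the
 * physical bus address of the stolen memory on integrated chips. */
static uint64_t vram_page_gpu_addr(uint64_t fb_offset, uint64_t page_offset)
{
	/* Guard against wrap-around in this sketch. */
	assert(page_offset <= UINT64_MAX - fb_offset);
	return fb_offset + page_offset;
}
```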

    v3: agd5f:
    - integrate with the semaphore/multi-ring stuff

    v4:
    - rebase on top ttm dma & multi-ring stuff
    - userspace is now in charge of the address space
    - no more specific cs vm ioctl, instead cs ioctl has a new
    chunk

    v5:
    - properly handle mem == NULL case from move_notify callback
    - fix the vm cleanup path

    v6:
    - fix update of page table to only happen on valid mem placement

    v7:
    - add tlb flush for each vm context
    - add flags to define mapping property (readable, writeable, snooped)
    - make ring id implicit from ib->fence->ring, up to each asic callback
    to then do ring specific scheduling if vm ib scheduling function

    v8:
    - add query for ib limit and kernel reserved virtual space
    - rename vm->size to max_pfn (maximum number of pages)
    - update gem_va ioctl to also allow unmap operation
    - bump kernel version to allow userspace to query for vm support

    v9:
    - rebuild page table only on bind, and incrementally depending
    on bos referenced by cs that have been moved
    - allow virtual address space to grow
    - use sa allocator for vram page table
    - return invalid when querying vm limit on non cayman GPU
    - dump vm fault register on lockup

    v10: agd5f:
    - Move the vm schedule_ib callback to a standalone function, remove
    the callback and use the existing ib_execute callback for VM IBs.

    v11:
    - rebase on top of latest Linus

    v12: agd5f:
    - remove spurious backslash
    - set IB vm_id to 0 in radeon_ib_get()

    v13: agd5f:
    - fix handling of RADEON_CHUNK_ID_FLAGS

    v14:
    - fix va destruction
    - fix suspend resume
    - forbid bo to have several different va in same vm

    v15:
    - rebase

    v16:
    - cleanup left over of vm init/fini

    v17: agd5f:
    - cs checker

    v18: agd5f:
    - reworks the CS ioctl to better support multiple rings and
    VM. Rather than adding a new chunk id for VM, just re-use the
    IB chunk id and add a new flag for VM mode. Also define additional
    dwords for the flags chunk id to define what ring we want to use
    (gfx, compute, uvd, etc.) and the priority.

    v19:
    - fix cs fini in weird case of no ib
    - semi working flush fix for ni
    - rebase on top of sa allocator changes

    v20: agd5f:
    - further CS ioctl cleanups from Christian's comments

    v21: agd5f:
    - integrate CS checker improvements

    v22: agd5f:
    - final cleanups for release, only allow VM CS on cayman

    Signed-off-by: Jerome Glisse
    Signed-off-by: Alex Deucher
    Signed-off-by: Dave Airlie

    Jerome Glisse
     

05 Jan, 2012

1 commit

  • In cases where the scanout hw is sufficiently similar between "overlay"
    and traditional crtc layers, it might be convenient to allow the driver
    to create internal drm_plane helper objects used by the drm_crtc
    implementation, rather than duplicate code between the plane and crtc.
    A private plane is not exposed to userspace.

    Signed-off-by: Rob Clark
    Signed-off-by: Dave Airlie

    Rob Clark
     

04 Jan, 2012

2 commits

  • These registers are automatically incremented by the hardware during
    transform feedback to track where the next streamed vertex output
    should go. Unlike the previous generation, which had a packet for
    setting the corresponding registers to a defined value, gen7 only has
    MI_LOAD_REGISTER_IMM to do so. That's a secure packet (since it loads
    an arbitrary register), so we need to do it from the kernel, and it
    needs to be settable atomically with the batchbuffer execution so that
    two clients doing transform feedback don't stomp on each other's
    state.

    Instead of building a more complicated interface involving setting the
    registers to a specific value, just set them to 0 when asked and
    userland can tweak its pointers accordingly.

    Signed-off-by: Eric Anholt
    Reviewed-by: Eugeni Dodonov
    Reviewed-by: Kenneth Graunke
    Signed-off-by: Keith Packard

    Eric Anholt
     
  • Add new ioctls for getting and setting the current destination color
    key. This allows for simple overlay display control by matching a color
    key value in the primary plane before blending the overlay on top.
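
    To make the matching step concrete, here is a per-pixel software model
    of a destination color key. The hardware does this in the display
    pipeline; the struct, the mask field and compose_pixel() are hypothetical
    names used only for this sketch and are not the ioctl's data layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical destination color key: pixels of the primary plane
 * that match `value` under `mask` let the overlay show through. */
struct dst_colorkey {
	uint32_t value;
	uint32_t mask;
};

static uint32_t compose_pixel(const struct dst_colorkey *key,
			      uint32_t primary, uint32_t overlay)
{
	assert(key != NULL);
	if ((primary & key->mask) == (key->value & key->mask))
		return overlay;   /* key color: overlay is visible here */
	return primary;           /* anything else: primary stays on top */
}
```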

    v2: remove unnecessary mutex acquire/release around reg accesses
    v3: add support for full color key management
    v4: fix copy & paste bug in snb_get_colorkey
    don't bother checking min/max values against docs as the docs are likely
    wrong (how could we handle 10bpc surface formats?)

    Reviewed-by: Daniel Vetter
    Signed-off-by: Jesse Barnes

    Jesse Barnes
     

29 Dec, 2011

2 commits

  • This patch adds hdmi display support to the exynos drm driver.

    There is already a v4l2 based exynos hdmi driver in drivers/media/video/s5p-tv,
    some low level code is already in s5p-tv, and even the headers for register
    definitions are almost the same. But in this patch, we decided not to split
    out common code shared with s5p-tv.

    Exynos HDMI is composed of 5 blocks: mixer, vp, hdmi, hdmiphy and ddc.

    1. mixer. The piece of hardware responsible for mixing and blending multiple
    data inputs before passing them to an output device. The mixer is capable of
    handling up to three image layers. One is the output of VP. The other two are
    images in RGB format. The blending factor and the layers' priority are
    controlled by the mixer's registers. The output is passed to HDMI.

    2. vp (video processor). It is used for processing of NV12/NV21 data. An image
    stored in RAM is accessed by DMA. The output in YCbCr444 format is sent to
    the mixer.

    3. hdmi. The piece of HW responsible for generation of HDMI packets. It takes
    pixel data from the mixer and transforms it into data frames. The output is
    sent to the HDMIPHY interface.

    4. hdmiphy. Physical interface for HDMI. Its duty is sending HDMI packets to
    the HDMI connector. Basically, it contains a PLL that produces the source
    clock for mixer, vp and hdmi.

    5. ddc (display data channel). It is a dedicated i2c channel used to exchange
    display information such as edid with the display monitor.

    With plane support, the exynos hdmi driver fully supports two mixer layers
    and the vp layer. The vp layer also supports multi buffer plane pixel formats
    that have non contiguous memory spaces.

    In the exynos drm driver, the common drm_hdmi driver that interfaces with the
    drm framework has operation pointers for mixer and hdmi. This drm_hdmi driver
    is registered as a sub driver of exynos_drm. hdmi has hdmiphy and ddc i2c
    clients and controls them. mixer controls all overlay layers in both mixer
    and vp.

    Vblank interrupts for hdmi are handled by mixer internally because the drm
    framework cannot support multiple irq ids. The pipe number is used to check
    for which display device the irq happened.

    History
    v2: this version
    - drm plane feature support to handle overlay layers.
    - multi buffer plane pixel format support for vp layer.
    - vp layer support

    RFCv1: original
    - at https://lkml.org/lkml/2011/11/4/164

    Signed-off-by: Seung-Woo Kim
    Signed-off-by: Inki Dae
    Signed-off-by: Joonyoung Shim
    Signed-off-by: Kyungmin Park

    Seung-Woo Kim
     
  • A multi buffer plane pixel format has separate memory spaces for each
    plane. For example, NV12M has a Y plane and a CbCr plane, and these are in
    non contiguous memory regions. Compared with NV12, NV12M's memory layout
    looks like the following.
    NV12 : ______(Y)(CbCr)_______
    NV12M : __(Y)_ ..... _(CbCr)__
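
    The difference can also be expressed as plane sizes and offsets. The
    sketch below assumes 8-bit 4:2:0 data with pitch equal to width, even
    dimensions and no alignment padding; the struct and function names are
    hypothetical and not taken from the driver.

```c
#include <assert.h>
#include <stddef.h>

struct nv12_planes {
	size_t y_size;     /* bytes in the Y plane */
	size_t cbcr_size;  /* bytes in the interleaved CbCr plane */
};

/* Plane sizes are the same for NV12 and NV12M. */
static struct nv12_planes nv12_plane_sizes(size_t width, size_t height)
{
	struct nv12_planes p;

	assert(width % 2 == 0 && height % 2 == 0);
	p.y_size = width * height;
	/* CbCr is subsampled 2x2 and stores two bytes per sample pair. */
	p.cbcr_size = width * height / 2;
	return p;
}

/* NV12: one contiguous buffer, so CbCr starts right after Y. */
static size_t nv12_cbcr_offset(size_t width, size_t height)
{
	return nv12_plane_sizes(width, height).y_size;
}
```

    For NV12M there is no such offset: each plane lives in its own buffer of
    y_size and cbcr_size bytes and starts at the beginning of that buffer.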

    Signed-off-by: Seung-Woo Kim
    Signed-off-by: Inki Dae
    Signed-off-by: Kyungmin Park

    Seung-Woo Kim
     

22 Dec, 2011

6 commits


21 Dec, 2011

2 commits


20 Dec, 2011

6 commits


14 Dec, 2011

1 commit


06 Dec, 2011

12 commits

  • Provide a helper function to compute the kernel memory size needed
    for each buffer object. Move all the accounting inside ttm, simplifying
    drivers and avoiding code duplication across them.
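
    A simplified sketch of what such a helper might compute is shown below.
    It is an assumption-laden model, not the ttm implementation: the page
    size, the function names and the choice to charge only the object struct
    plus one page pointer per backing page are all made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed page size */

/* Number of pages needed to back a buffer of bo_size bytes. */
static size_t sketch_num_pages(size_t bo_size)
{
	return (bo_size + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}

/* Kernel memory charged for one buffer object in this sketch:
 * the driver's object struct plus one page pointer per backing page. */
static size_t sketch_acc_size(size_t bo_size, size_t struct_size)
{
	assert(bo_size > 0);
	return struct_size + sketch_num_pages(bo_size) * sizeof(void *);
}
```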

    v2 fix accounting of ghost objects, one would have thought that i
    would have run into the issue long ago but it seems
    ghost objects are rare when you have plenty of vram ;)

    Signed-off-by: Jerome Glisse
    Reviewed-by: Thomas Hellstrom

    Jerome Glisse
     
  • Move dma data to a superset ttm_dma_tt structure which inherits
    from ttm_tt. This allows drivers that don't use dma functionality
    to avoid wasting memory on it.

    V2 Rebase on top of no memory account changes (where/when is my
    delorean when i need it ?)
    V3 Make sure page list is initialized empty
    V4 typo/syntax fixes

    Signed-off-by: Jerome Glisse
    Reviewed-by: Thomas Hellstrom

    Jerome Glisse
     
  • In TTM world the pages for the graphic drivers are kept in three different
    pools: write combined, uncached, and cached (write-back). When the pages
    are used by the graphic driver the graphic adapter via its built in MMU
    (or AGP) programs these pages in. The programming requires the virtual address
    (from the graphic adapter perspective) and the physical address (either System RAM
    or the memory on the card) which is obtained using the pci_map_* calls (which do the
    virtual to physical - or bus address translation). During the graphic application's
    "life" those pages can be shuffled around, swapped out to disk, moved from the
    VRAM to System RAM or vice-versa. This all works with the existing TTM pool code
    - except when we want to use the software IOTLB (SWIOTLB) code to "map" the physical
    addresses to the graphic adapter MMU. We end up programming the bounce buffer's
    physical address instead of the TTM pool memory's and get a non-working driver.
    There are two solutions:
    1) using the DMA API to allocate pages that are screened by the DMA API, or
    2) using the pci_sync_* calls to copy the pages from the bounce-buffer and back.

    This patch fixes the issue by allocating pages using the DMA API. The second
    is a viable option - but it has performance drawbacks and potential correctness
    issues - think of the write cache page being bounced (SWIOTLB->TTM), the
    WC is set on the TTM page and the copy from SWIOTLB not making it to the TTM
    page until the page has been recycled in the pool (and used by another application).

    The bounce buffer does not get activated often - only in cases where we have
    a 32-bit capable card and we want to use a page that is allocated above the
    4GB limit. The bounce buffer offers the solution of copying the contents
    of that 4GB page to a location below 4GB and then back when the operation has been
    completed (or vice-versa). This is done by using the 'pci_sync_*' calls.
    Note: If you look carefully enough in the existing TTM page pool code you will
    notice the GFP_DMA32 flag is used - which should guarantee that the provided page
    is under 4GB. It certainly is the case, except this gets ignored in two cases:
    - If user specifies 'swiotlb=force' which bounces _every_ page.
    - If user is using a Xen's PV Linux guest (which uses the SWIOTLB and the
    underlying PFN's aren't necessarily under 4GB).
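
    The condition under which a page ends up bounced can be modelled as a
    simple predicate. This is a sketch under stated assumptions: the function
    name and parameters are hypothetical, the device limit is expressed as a
    DMA mask, and the Xen PV case is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Decide whether a buffer would have to go through the bounce buffer.
 * `force` models the 'swiotlb=force' case, which bounces every page. */
static bool needs_bounce(uint64_t phys_addr, uint64_t size,
			 uint64_t dma_mask, bool force)
{
	assert(size > 0);
	if (force)
		return true;
	/* The last byte of the buffer must be addressable by the device. */
	return phys_addr + size - 1 > dma_mask;
}
```

    With a 32-bit mask a page at or above 4GB is bounced, while the same page
    handed to a 64-bit capable device is used directly.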

    To avoid this extra copying, the other option is to allocate the pages
    using the DMA API so that there is no need to map the page and perform the
    expensive 'pci_sync_*' calls.

    To do this, the DMA API capable TTM pool requires the 'struct device' to
    properly call the DMA API. It also has to track the virtual and bus address of
    the page being handed out in case it ends up being swapped out or de-allocated -
    to make sure it is de-allocated using the proper 'struct device'.

    Implementation wise the code keeps two lists: one that is attached to the
    'struct device' (via the dev->dma_pools list) and a global one to be used when
    the 'struct device' is unavailable (think shrinker code). The global list can
    iterate over all of the 'struct device' and its associated dma_pool. The list
    in dev->dma_pools can only iterate the device's dma_pool.
                                                                /[struct device_pool]\
            /---------------------------------------------------| dev                |
           /                                            +-------| dma_pool           |
     /-----+------\                                    /        \--------------------/
     |struct device|     /-->[struct dma_pool for WC]</         /[struct device_pool]\
     | dma_pools   +----+                                     /-| dev                |
     |  ...        |    \--->[struct dma_pool for uncached]

    [v1: Using swiotlb_nr_tbl instead of swiotlb_enabled]
    [v2: Major overhaul - added 'inuse_list' to separate used from inuse and reorder
    the order of lists to get better performance.]
    [v3: Added comments/and some logic based on review, Added Jerome tag]
    [v4: rebase on top of ttm_tt & ttm_backend merge]
    [v5: rebase on top of ttm memory accounting overhaul]
    [v6: New rebase on top of more memory accounting changes]
    [v7: well rebase on top of no memory accounting changes]
    [v8: make sure pages list is initialized empty]
    [v9: call ttm_mem_global_free_page in unpopulate for accurate accounting]

    Signed-off-by: Konrad Rzeszutek Wilk
    Reviewed-by: Jerome Glisse
    Acked-by: Thomas Hellstrom

    Konrad Rzeszutek Wilk
     
  • Move the page allocation and freeing to driver callbacks and
    provide ttm code helper functions for those.

    The most intrusive change is the fact that we now only fully
    populate an object. This simplifies some of the code designed around
    the page fault design.

    V2 Rebase on top of memory accounting overhaul
    V3 New rebase on top of more memory accounting changes
    V4 Rebase on top of no memory account changes (where/when is my
    delorean when i need it ?)

    Signed-off-by: Jerome Glisse
    Reviewed-by: Konrad Rzeszutek Wilk
    Reviewed-by: Thomas Hellstrom

    Jerome Glisse
     
  • ttm_backend will only exist with a ttm_tt, and ttm_tt
    will only be of interest when bound to a backend. Merge them
    to avoid code and data duplication.

    V2 Rebase on top of memory accounting overhaul
    V3 Rebase on top of more memory accounting changes
    V4 Rebase on top of no memory account changes (where/when is my
    delorean when i need it ?)
    V5 make sure ttm is unbound before destroying, change commit
    message on suggestion from Tormod Volden

    Signed-off-by: Jerome Glisse
    Reviewed-by: Konrad Rzeszutek Wilk
    Reviewed-by: Thomas Hellstrom

    Jerome Glisse
     
  • Use the ttm_tt pages array for pages allocations, move the list
    unwinding into the page allocation functions.

    Signed-off-by: Jerome Glisse

    Jerome Glisse
     
  • This field is not used by any of the drivers, so just drop it.

    Signed-off-by: Jerome Glisse
    Reviewed-by: Konrad Rzeszutek Wilk
    Reviewed-by: Thomas Hellstrom

    Jerome Glisse
     
  • The split between highmem and lowmem pages was rendered useless by the
    pool code. Remove it. Note that further cleanup would change the
    ttm page allocation helper to actually take an array instead
    of relying on a list; this could drastically reduce the number of
    function calls in the common case of allocating a whole buffer.

    Signed-off-by: Jerome Glisse
    Reviewed-by: Konrad Rzeszutek Wilk
    Reviewed-by: Thomas Hellstrom

    Jerome Glisse
     
  • This was never used in any of the drivers; properly using userspace
    pages for a bo would need more code (vma interaction mostly). Remove
    this dead code in preparation for the ttm_tt & backend merge.

    Signed-off-by: Jerome Glisse
    Reviewed-by: Konrad Rzeszutek Wilk
    Reviewed-by: Thomas Hellstrom

    Jerome Glisse
     
  • Including a comment about what the locks are for.

    Signed-off-by: Jesse Barnes
    Reviewed-by: Alex Deucher
    Signed-off-by: Dave Airlie

    Jesse Barnes
     
  • This is actually a core structure with a big future ahead of it. Make
    it a little less mysterious.

    Signed-off-by: Jesse Barnes
    Signed-off-by: Dave Airlie

    Jesse Barnes
     
  • Just fix the wrapping mostly.

    Signed-off-by: Jesse Barnes
    Reviewed-by: Alex Deucher
    Signed-off-by: Dave Airlie

    Jesse Barnes