10 Aug, 2016

1 commit

  • Recent versions of gcc say this:

    include/drm/i915_drm.h:96:34: warning: result of ‘65535 << 20’
    requires 37 bits to represent, but ‘int’ only has 32 bits
    [-Wshift-overflow=]
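
    A minimal sketch of why the warning fires and how an unsigned operand
    avoids it. The names FIELD_SHIFT, field_mask_u32 and field_mask_u64 are
    hypothetical; the log does not show the macro at i915_drm.h:96 or the
    fix that was actually applied.

```c
#include <stdint.h>

/* Hypothetical shift amount, taken from the '65535 << 20' in the warning. */
#define FIELD_SHIFT 20

/*
 * 65535 is a 16-bit value; shifted left by 20 it needs 36 value bits plus a
 * sign bit, which does not fit in a 32-bit signed int. With an unsigned
 * 32-bit operand the shift is well defined and the result wraps modulo 2^32.
 */
static inline uint32_t field_mask_u32(void)
{
    return (uint32_t)0xFFFFu << FIELD_SHIFT;
}

/* Widening to 64 bits first keeps all 36 bits of the shifted value. */
static inline uint64_t field_mask_u64(void)
{
    return (uint64_t)0xFFFFu << FIELD_SHIFT;
}
```

    Which of the two results is wanted depends on whether the mask is meant
    to cover only the upper bits of a 32-bit register or a wider address.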

    Reported-by: David Binderman
    Signed-off-by: Dave Gordon
    Cc: Dave Airlie
    Reviewed-by: Chris Wilson
    Signed-off-by: Daniel Vetter
    Link: http://patchwork.freedesktop.org/patch/msgid/1470764110-23855-1-git-send-email-david.s.gordon@intel.com

    Dave Gordon
     

25 Apr, 2016

1 commit

  • Move the better constructs/comments from i915_gem_stolen.c to
    early-quirks.c and increase readability in preparation of only
    having one set of functions.

    - intel_stolen_base -> gen3_stolen_base
    - use phys_addr_t instead of u32 for address for future proofing

    v2:
    - Print the invalid register values (Chris)
    (Omitting the register prefix as it's visible from backtrace.)

    Cc: Chris Wilson
    Cc: Mika Kuoppala
    Cc: Ville Syrjälä
    Cc: Tvrtko Ursulin
    Acked-by: Tvrtko Ursulin
    Reviewed-by: Chris Wilson
    Signed-off-by: Joonas Lahtinen

    Joonas Lahtinen
     

09 Feb, 2014

1 commit

  • There isn't an explicit stolen memory base register on gen2.
    Some old comment in the i915 code suggests we should get it via
    max_low_pfn_mapped, but that's clearly a bad idea on my MGM.

    The e820 map in said machine looks like this:

    BIOS-e820: [mem 0x0000000000000000-0x000000000009f7ff] usable
    BIOS-e820: [mem 0x000000000009f800-0x000000000009ffff] reserved
    BIOS-e820: [mem 0x00000000000ce000-0x00000000000cffff] reserved
    BIOS-e820: [mem 0x00000000000dc000-0x00000000000fffff] reserved
    BIOS-e820: [mem 0x0000000000100000-0x000000001f6effff] usable
    BIOS-e820: [mem 0x000000001f6f0000-0x000000001f6f7fff] ACPI data
    BIOS-e820: [mem 0x000000001f6f8000-0x000000001f6fffff] ACPI NVS
    BIOS-e820: [mem 0x000000001f700000-0x000000001fffffff] reserved
    BIOS-e820: [mem 0x00000000fec10000-0x00000000fec1ffff] reserved
    BIOS-e820: [mem 0x00000000ffb00000-0x00000000ffbfffff] reserved
    BIOS-e820: [mem 0x00000000fff00000-0x00000000ffffffff] reserved

    That makes max_low_pfn_mapped = 1f6f0000, so assuming our stolen
    memory would start there would place it on top of some ACPI
    memory regions. So not a good idea as already stated.

    The 9MB region after the ACPI regions at 0x1f700000 however
    looks promising given that the machine reports the stolen memory
    size to be 8MB. Looking at the PGTBL_CTL register, the GTT
    entries are at offset 0x1fee0000, and given that the GTT
    entries occupy 128KB, it looks like the stolen memory could
    start at 0x1f700000 and the GTT entries would occupy the last
    128KB of the stolen memory.

    After some more digging through chipset documentation, I've
    determined the BIOS first allocates space for something called
    TSEG (something to do with SMM) from the top of memory, and then
    it allocates the graphics stolen memory below that. According to
    the chipset documentation TSEG has a fixed size of 1MB on 855.
    So that explains the top 1MB in the e820 region. And it also
    confirms that the GTT entries are in fact at the end of the
    stolen memory region.

    Derive the stolen memory base address on gen2 the same as the
    BIOS does (TOM-TSEG_SIZE-stolen_size). There are a few
    differences between the registers on various gen2 chipsets, so a
    few different codepaths are required.
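
    The derivation above can be checked with the numbers from this message.
    This is only a sketch: the function names are hypothetical, and the
    top-of-memory value 0x20000000 is an assumption inferred from the end
    of the reserved e820 region (0x1fffffff), not a value read from a
    chipset register.

```c
#include <stdint.h>

#define KB(x) ((uint64_t)(x) << 10)
#define MB(x) ((uint64_t)(x) << 20)

/* Stolen memory sits directly below TSEG, which sits at the top of memory. */
static uint64_t gen2_stolen_base_sketch(uint64_t tom, uint64_t tseg_size,
                                        uint64_t stolen_size)
{
    return tom - tseg_size - stolen_size;
}

/* The GTT entries occupy the last gtt_size bytes of the stolen region. */
static uint64_t gtt_entries_base_sketch(uint64_t stolen_base,
                                        uint64_t stolen_size,
                                        uint64_t gtt_size)
{
    return stolen_base + stolen_size - gtt_size;
}
```

    With TOM = 0x20000000, a 1MB TSEG and 8MB of stolen memory, the base
    comes out at 0x1f700000, and a 128KB GTT at the end of that region
    starts at 0x1fee0000.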

    865G is again a bit more special since it seems to support enough
    memory to hit 4GB address space issues. This means the PCI
    allocations will also affect the location of the stolen memory.
    Fortunately there appears to be the TOUD register which may give
    us the correct answer directly. But the chipset docs are a bit
    unclear, so I'm not 100% sure that the graphics stolen memory is
    always the last thing the BIOS steals. Someone would need to
    verify it on a real system.

    I tested this on my 830 and 855 machines, and so far
    everything looks peachy.

    Signed-off-by: Ville Syrjälä
    Cc: Bjorn Helgaas
    Link: http://lkml.kernel.org/r/1391628540-23072-3-git-send-email-ville.syrjala@linux.intel.com
    Signed-off-by: Ingo Molnar

    Ville Syrjälä
     

09 Nov, 2013

1 commit

  • All the BARs have the ability to grow.

    v2: Pulled out the simulator workaround to a separate patch.
    Rebased.

    v3: Rebase onto latest vlv patches from Jesse.

    v4: Rebased on top of the early stolen quirk patch from Jesse.

    v5: Use the new macro names.
    s/INTEL_BDW_PCI_IDS_D/INTEL_BDW_D_IDS
    s/INTEL_BDW_PCI_IDS_M/INTEL_BDW_M_IDS
    It's Jesse's fault for not following the convention I originally set.

    Cc: Ingo Molnar
    Cc: H. Peter Anvin
    Cc: Jesse Barnes
    Signed-off-by: Ben Widawsky
    Signed-off-by: Daniel Vetter

    Ben Widawsky
     

04 Sep, 2013

2 commits

  • Systems with Intel graphics controllers set aside memory exclusively for
    gfx driver use. This memory is not always marked in the E820 as
    reserved or as RAM, and so is subject to overlap from E820 manipulation
    later in the boot process. On some systems, MMIO space is allocated on
    top, despite the efforts of the "RAM buffer" approach, which simply
    rounds memory boundaries up to 64M to try to catch space that may decode
    as RAM and so is not suitable for MMIO.
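
    As a rough illustration of the "RAM buffer" rounding mentioned above,
    the sketch below rounds an end-of-RAM address up to the next 64M
    boundary. The names are hypothetical and this is not the kernel's
    implementation; it only shows the arithmetic the message describes.

```c
#include <stdint.h>

/* 64M alignment, as described in the commit message. */
#define RAM_BUFFER_ALIGN (UINT64_C(64) << 20)

/* Round ram_end up to the next multiple of 64M (power-of-two alignment). */
static uint64_t ram_buffer_end_sketch(uint64_t ram_end)
{
    return (ram_end + RAM_BUFFER_ALIGN - 1) & ~(RAM_BUFFER_ALIGN - 1);
}
```

    Anything between ram_end and the rounded value would be kept away from
    MMIO allocation under this approach; stolen memory lying beyond the
    rounded boundary would not be covered.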

    v2: use read_pci_config for 32 bit reads instead of adding a new one
    (Chris)
    add gen6 stolen size function (Chris)
    v3: use a function pointer (Chris)
    drop gen2 bits (Daniel)
    v4: call e820_sanitize_map after adding the region
    v5: fixup comments (Peter)
    simplify loop (Chris)

    Acked-by: Ingo Molnar
    Signed-off-by: Jesse Barnes
    Acked-by: H. Peter Anvin
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66726
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66844
    Signed-off-by: Daniel Vetter

    Jesse Barnes
     
  • For use by userspace (at some point in the future) and other kernel code.

    v2: move PCI IDs to uabi (Chris)
    move PCI IDs to drm/ (Dave)
    v3: fixup Quanta detection - needs to come first (Daniel)
    v4: fix up PCI match structure init for easier use by userspace (Chris)

    Signed-off-by: Jesse Barnes
    Signed-off-by: Daniel Vetter

    Jesse Barnes
     

05 Oct, 2012

1 commit


04 Oct, 2012

1 commit

  • Pull drm merge (part 1) from Dave Airlie:
    "So first of all my tree and uapi stuff has a conflict mess, it's my
    fault as the nouveau stuff didn't hit -next as we were trying to rebase
    regressions out of it before we merged.

    Highlights:
    - SH mobile modesetting driver and associated helpers
    - some DRM core documentation
    - i915 modesetting rework, haswell hdmi, haswell and vlv fixes, write
    combined pte writing, ilk rc6 support,
    - nouveau: major driver rework into a hw core driver, makes features
    like SLI a lot saner to implement,
    - psb: add eDP/DP support for Cedarview
    - radeon: 2 layer page tables, async VM pte updates, better PLL
    selection for > 2 screens, better ACPI interactions

    The rest is general grab bag of fixes.

    So why part 1? well I have the exynos pull req which came in a bit
    late but was waiting for me to do something they shouldn't have and it
    looks fairly safe, and David Howells has some more header cleanups
    he'd like me to pull, that seem like a good idea, but I'd like to get
    this merge out of the way so -next doesn't get blocked."

    Tons of conflicts mostly due to silly include line changes, but mostly
    mindless. A few other small semantic conflicts too, noted from Dave's
    pre-merged branch.

    * 'drm-next' of git://people.freedesktop.org/~airlied/linux: (447 commits)
    drm/nv98/crypt: fix fuc build with latest envyas
    drm/nouveau/devinit: fixup various issues with subdev ctor/init ordering
    drm/nv41/vm: fix and enable use of "real" pciegart
    drm/nv44/vm: fix and enable use of "real" pciegart
    drm/nv04/dmaobj: fixup vm target handling in preparation for nv4x pcie
    drm/nouveau: store supported dma mask in vmmgr
    drm/nvc0/ibus: initial implementation of subdev
    drm/nouveau/therm: add support for fan-control modes
    drm/nouveau/hwmon: rename pwm0* to pmw1* to follow hwmon's rules
    drm/nouveau/therm: calculate the pwm divisor on nv50+
    drm/nouveau/fan: rewrite the fan tachometer driver to get more precision, faster
    drm/nouveau/therm: move thermal-related functions to the therm subdev
    drm/nouveau/bios: parse the pwm divisor from the perf table
    drm/nouveau/therm: use the EXTDEV table to detect i2c monitoring devices
    drm/nouveau/therm: rework thermal table parsing
    drm/nouveau/gpio: expose the PWM/TOGGLE parameter found in the gpio vbios table
    drm/nouveau: fix pm initialization order
    drm/nouveau/bios: check that fixed tvdac gpio data is valid before using it
    drm/nouveau: log channel debug/error messages from client object rather than drm client
    drm/nouveau: have drm debugging macros build on top of core macros
    ...

    Linus Torvalds
     

03 Oct, 2012

1 commit


26 Sep, 2012

1 commit


20 Sep, 2012

1 commit

  • There are internal patches for a feature which require a parameter to
    query whether support exists. These patches cannot be made external
    yet. In order to keep existing tests and userspace happy and free from
    conflicts, reserve a number for it.

    Signed-off-by: Ben Widawsky
    Signed-off-by: Daniel Vetter

    Ben Widawsky
     

17 Aug, 2012

1 commit

  • In order for udl vmap to work properly, we need to push the object
    into the CPU domain before we start copying the data to the USB device.

    This, along with the udl change, avoids the need for userspace to use
    explicit mapping.

    v2: add a flag for userspace to query to know if Intel kernel driver can
    deal with the vmap flushing properly. In theory udl would need a flag also,
    but I intend to push the patches very close to each other and other drivers
    should do the right thing from the start.

    I've added a test to my intel-gpu-tools prime branch, however testing
    this is a bit messy since the only way to get udl to vmap is to render
    something. I've tested this with real code as well to make sure it works.

    Signed-off-by: Dave Airlie
    [danvet: resolved conflict, which required reallocating the PARAM
    number to 21.]
    Signed-off-by: Daniel Vetter

    Dave Airlie
     

08 Aug, 2012

1 commit

  • Userspace tries to estimate the cost of ring switching based on whether
    the GPU and GEM support semaphores. (If we have multiple rings and no
    semaphores, userspace assumes that the cost of switching rings between
    batches is exorbitant and will endeavour to keep the next batch on the
    active ring - as a coarse approximation to tracking both destination and
    source surfaces.) Currently userspace has to guess whether semaphores
    exist based on the chipset generation and the module parameter,
    i915.semaphores. This is a crude and inaccurate guess as the defaults
    internally depend upon other chipset features being enabled or disabled,
    nor does it extend well into the future. By exporting a HAS_SEMAPHORES
    parameter, we can easily query the driver and obtain an accurate answer.

    Signed-off-by: Chris Wilson
    Signed-off-by: Daniel Vetter

    Chris Wilson
     

26 Jul, 2012

4 commits

  • By selecting the cache level (essentially whether or not the CPU snoops
    any updates to the bo, and on more recent machines whether it resides
    inside the CPU's last-level-cache) a userspace driver is able to then
    manage all of its memory within buffer objects, if it so desires. This
    enables the userspace driver to accelerate uploads and more importantly
    downloads from the GPU and to be able to mix CPU and GPU rendering/activity
    efficiently.

    Signed-off-by: Chris Wilson
    [danvet: Added code comment about where we plan to stuff platform
    specific caching control bits in the ioctl struct.]
    Signed-off-by: Daniel Vetter

    Chris Wilson
     
  • The intention is to help select which engine to use for copies with
    interoperating clients - such as a GL client making a request to the X
    server to perform a SwapBuffers, which may require copying from the
    active GL back buffer to the X front buffer.

    We choose to report a mask of the active rings to future proof the
    interface against any changes which may allow for the object to reside
    upon multiple rings.

    Signed-off-by: Chris Wilson
    [danvet: bikeshed away the write ring mask and add the explanation
    Chris sent in a follow-up mail why we decided to use masks.]
    Signed-off-by: Daniel Vetter

    Chris Wilson
     
  • The interface's immediate purpose is to do synchronous timestamp queries
    as required by GL_TIMESTAMP. The GPU has a register for reading the
    timestamp but because that would normally require root access through
    libpciaccess, the IOCTL can provide this service instead.

    Currently the implementation whitelists only the render ring timestamp
    register, because that is the only thing we need to expose at this time.

    v2: make size implicit based on the register offset
    Add a generation check

    Reviewed-by: Eric Anholt
    Cc: Jacek Lawrynowicz
    Signed-off-by: Ben Widawsky
    [danvet: fixup the ioctl number.]
    Signed-off-by: Daniel Vetter

    Ben Widawsky
     
  • I'm planning to merge this next week for 3.7, but I'd like to avoid
    stupid conflicts with the existing userspace when merging the new
    reg_read ioctl (which doesn't have userspace yet, but this caching
    interface has).

    Header extracted from Chris Wilson's patch, but fix up the copy&pasted
    comment in the interface struct.

    Signed-off-by: Chris Wilson
    Signed-off-by: Daniel Vetter

    Daniel Vetter
     

14 Jun, 2012

2 commits

  • Use the rsvd1 field in execbuf2 to specify the context ID associated
    with the workload. This will allow the driver to do the proper context
    switch when/if needed.

    v2: Add checks for context switches on rings not supporting contexts.
    Before the code would silently ignore such requests.

    Signed-off-by: Ben Widawsky

    Ben Widawsky
     
  • Add the interfaces to allow user space to create and destroy contexts.
    Contexts are destroyed automatically if the file descriptor for the dri
    device is closed.

    Following convention as usual here causes checkpatch warnings.

    v2: with is_initialized, no longer need to init at create
    drop the context switch on create (daniel)

    v3: Use interruptible lock (Chris)
    return -ENODEV in !GEM case (Chris)

    Signed-off-by: Ben Widawsky

    Ben Widawsky
     

06 Jun, 2012

2 commits

  • Signed-off-by: Ben Widawsky
    Reviewed-by: Chris Wilson
    Signed-off-by: Daniel Vetter

    Ben Widawsky
     
  • Change the ns_timeout parameter of the wait ioctl to a signed value.
    Doing this allows the kernel to provide an infinite wait when a timeout
    of less than 0 is provided. This mimics select/poll.

    Initially the parameter was meant to match up with the GL spec 1:1, but
    after being made aware of how much 2^64 - 1 nanoseconds actually is, I
    do not think anyone will ever notice the loss of 1 bit.

    The infinite timeout on waiting is similar to the existing i915
    userspace interface with the exception that struct_mutex is dropped
    while doing the wait in this ioctl.
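
    A small sketch of the timeout convention described above, together with
    the arithmetic behind the "loss of 1 bit" remark. The enum and function
    names are hypothetical and do not come from the driver; treating 0 as
    an immediate check follows the description of the wait ioctl in the
    25 May, 2012 entry.

```c
#include <stdint.h>

enum wait_mode { WAIT_POLL, WAIT_TIMED, WAIT_INFINITE };

/* Negative means wait forever (like select/poll), zero means check once. */
static enum wait_mode classify_timeout_sketch(int64_t timeout_ns)
{
    if (timeout_ns < 0)
        return WAIT_INFINITE;
    if (timeout_ns == 0)
        return WAIT_POLL;
    return WAIT_TIMED;
}

/* Whole 365-day years representable by the largest signed 64-bit timeout. */
static int64_t max_timeout_years_sketch(void)
{
    const int64_t ns_per_year = INT64_C(1000000000) * 60 * 60 * 24 * 365;
    return INT64_MAX / ns_per_year;
}
```

    Even with the sign bit given up, the largest finite timeout is still on
    the order of centuries, which is why the lost bit is unlikely to matter
    in practice.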

    Cc: Chris Wilson
    Signed-off-by: Ben Widawsky
    Reviewed-by: Chris Wilson
    Signed-off-by: Daniel Vetter

    Ben Widawsky
     

25 May, 2012

1 commit

  • This helps implement GL_ARB_sync but stops short of allowing full blown
    sync objects. Finally we can use the new timed seqno waiting function
    to allow userspace to wait on a buffer object with a timeout. This
    implements that interface.

    The IOCTL will take as input a buffer object handle, and a timeout in
    nanoseconds (flags is currently optional but will likely be used for
    permutations of flush operations). Users may specify 0 nanoseconds to
    instantly check.

    The wait ioctl with a timeout of 0 reimplements the busy ioctl. With any
    non-zero timeout parameter the wait ioctl will wait for the given number
    of nanoseconds on an object becoming unbusy. Since the wait itself does
    so holding struct_mutex the object may become re-busied before this
    completes. A similar but shorter race condition exists in the busy
    ioctl.

    v2: ETIME/ERESTARTSYS instead of changing to EBUSY, and EAGAIN (Chris)
    Flush the object from the gpu write domain (Chris + Daniel)
    Fix leaked refcount in good case (Chris)
    Naturally align ioctl struct (Chris)

    v3: Drop lock after getting seqno to avoid ugly dance (Chris)

    v4: check for 0 timeout after olr check to allow polling (Chris)

    v5: Updated the comment. (Chris)

    v6: Return -ETIME instead of -EBUSY when timeout_ns is 0 (Daniel)
    Fix the commit message comment to be less ugly (Ben)
    Add a warning to check the return timespec (Ben)

    v7: Use DRM_AUTH for the ioctl. (Eugeni)

    Signed-off-by: Ben Widawsky
    Signed-off-by: Daniel Vetter

    Ben Widawsky
     

21 Mar, 2012

1 commit

  • On Sandybridge a few MI read/write commands only work when ppgtt is
    enabled. Userspace therefore needs to be able to check whether ppgtt
    is enabled. For added hilarity, you need to reset the "use global GTT"
    bit on snb when ppgtt is enabled, otherwise it won't work. Despite
    what bspec says about automatically using ppgtt ...

    Luckily PIPE_CONTROL (the only write cmd current userspace uses) is
    not affected by all this, as tested by tests/gem_pipe_control_store_loop.

    Reviewed-and-tested-by: Chris Wilson
    Signed-off-by: Daniel Vetter

    Daniel Vetter
     

18 Jan, 2012

1 commit


04 Jan, 2012

2 commits

  • These registers are automatically incremented by the hardware during
    transform feedback to track where the next streamed vertex output
    should go. Unlike the previous generation, which had a packet for
    setting the corresponding registers to a defined value, gen7 only has
    MI_LOAD_REGISTER_IMM to do so. That's a secure packet (since it loads
    an arbitrary register), so we need to do it from the kernel, and it
    needs to be settable atomically with the batchbuffer execution so that
    two clients doing transform feedback don't stomp on each other's
    state.

    Instead of building a more complicated interface involving setting the
    registers to a specific value, just set them to 0 when asked and
    userland can tweak its pointers accordingly.

    Signed-off-by: Eric Anholt
    Reviewed-by: Eugeni Dodonov
    Reviewed-by: Kenneth Graunke
    Signed-off-by: Keith Packard

    Eric Anholt
     
  • Add new ioctls for getting and setting the current destination color
    key. This allows for simple overlay display control by matching a color
    key value in the primary plane before blending the overlay on top.

    v2: remove unnecessary mutex acquire/release around reg accesses
    v3: add support for full color key management
    v4: fix copy & paste bug in snb_get_colorkey
    don't bother checking min/max values against docs as the docs are likely
    wrong (how could we handle 10bpc surface formats?)

    Reviewed-by: Daniel Vetter
    Signed-off-by: Jesse Barnes

    Jesse Barnes
     

23 Jul, 2011

1 commit

  • Because of a typo, calling ioctl with DRM_IOCTL_I915_OVERLAY_PUT_IMAGE
    is broken if the macro is used directly. When using libdrm the bug is
    not hit, since libdrm handles the ioctl encoding internally.

    The typo also leads to the .cmd and .cmd_drv fields of the drm_ioctl
    structure for DRM_I915_OVERLAY_PUT_IMAGE having inconsistent content.

    Signed-off-by: Ole Henrik Jahren
    Acked-by: Daniel Vetter
    Cc: stable@kernel.org
    Signed-off-by: Keith Packard

    Ole Henrik Jahren
     

02 Mar, 2011

1 commit


20 Dec, 2010

1 commit


05 Dec, 2010

1 commit

  • Otherwise we can't really fix the abi-braindeadness of forcing
    libva to manually wait for rendering when switching rings. Which
    in turn makes implementing hw semaphores a pointless exercise
    (at least for ironlake).

    [Also added the relaxed fencing param to explain the jump in
    numbering - relaxed fencing is in -next.]

    Signed-off-by: Daniel Vetter
    Signed-off-by: Chris Wilson

    Daniel Vetter
     

22 Oct, 2010

1 commit


24 Aug, 2010

1 commit

  • * 'drm-core-next' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: (33 commits)
    drm/radeon/kms: fix typo in radeon_compute_pll_gain
    drm/radeon/kms: try to detect tv vs monitor for underscan
    drm/radeon/kms: fix sideport detection on newer rs880 boards
    drm/radeon: fix passing wrong type to gem object create.
    drm/radeon/kms: set encoder type to DVI for HDMI on evergreen
    drm/radeon/kms: add back missing break in info ioctl
    drm/radeon/kms: don't enable MSIs on AGP boards
    drm/radeon/kms: fix agp mode setup on cards that use pcie bridges
    drm: move dereference below check
    drm: fix end of loop test
    drm/radeon/kms: rework radeon_dp_detect() logic
    drm/radeon/kms: add missing asic callback assignment for evergreen
    drm/radeon/kms/DCE3+: switch pads to ddc mode when going i2c
    drm/radeon/kms/pm: bail early if nothing's changing
    drm/radeon/kms/atom: clean up dig atom handling
    drm/radeon/kms: DCE3/4 transmitter fixes
    drm/radeon/kms: rework encoder handling
    drm/radeon/kms: DCE3/4 AdjustPixelPll updates
    drm/radeon: Fix stack data leak
    drm/radeon/kms: fix GTT/VRAM overlapping test
    ...

    Linus Torvalds
     

17 Aug, 2010

1 commit

  • With the current screwed-up-but-it's-ABI ioctls for the drm, Linus pointed out that we allow userspace to specify the allocation size, but we pass it to the driver, which then uses it blindly to store a struct. If userspace specifies an allocation size smaller than the driver needs, the driver can overwrite memory.

    This patch restructures the driver ioctls so we store the structure size we are expecting, and make sure we allocate at least that size. The copies from/to userspace are still restricted to the size the user specifies; this allows ioctl structs to grow on both sides of the equation.
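
    The sizing rule can be sketched in plain userspace C as follows. The
    function name is hypothetical, calloc/memcpy stand in for the kernel's
    allocation and copy_from_user, and zero-filling the tail is a choice
    made for this sketch rather than something the message states.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Allocate at least the size the driver expects, but copy in only as many
 * bytes as the caller said it provided. Returns NULL on allocation failure;
 * the caller frees the result.
 */
static void *ioctl_arg_alloc_sketch(const void *user_buf, size_t user_size,
                                    size_t drv_size)
{
    size_t alloc_size = user_size > drv_size ? user_size : drv_size;
    void *kdata = calloc(1, alloc_size ? alloc_size : 1);

    if (!kdata)
        return NULL;
    if (user_size)
        memcpy(kdata, user_buf, user_size);
    return kdata;
}
```

    With this shape the driver can always write a full struct into the
    buffer, even when an older userspace passes a shorter one.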

    Up until now we didn't really use the DRM_IOCTL defines in the kernel, so this cleans them up and adds them for nouveau.

    v2:
    fix nouveau pushbuf arg (thanks to Ben for pointing it out)

    Reported-by: Linus Torvalds
    Signed-off-by: Dave Airlie

    Dave Airlie
     

03 Aug, 2010

1 commit

  • Intel Core i3/5 platforms with integrated graphics support both CPU and
    GPU turbo mode. CPU turbo mode is opportunistic: the CPU will use any
    available power to increase core frequencies if thermal headroom is
    available. The GPU side is more manual however; the graphics driver
    must monitor GPU power and temperature and coordinate with a core
    thermal driver to take advantage of available thermal and power headroom
    in the package.

    The intelligent power sharing (IPS) driver is intended to coordinate
    this activity by monitoring MCP (multi-chip package) temperature and
    power, allowing the CPU and/or GPU to increase their power consumption,
    and thus performance, when possible. The goal is to maximize
    performance within a given platform's TDP (thermal design point).

    Signed-off-by: Jesse Barnes
    Signed-off-by: Matthew Garrett

    Jesse Barnes
     

02 Jun, 2010

1 commit


27 May, 2010

1 commit

  • Introduces a more complete intel_ring_buffer structure with callbacks
    for setup and management of a particular ringbuffer, and converts the
    render ring buffer consumers to use it.

    Signed-off-by: Zou Nan hai
    Signed-off-by: Xiang Hai hao
    [anholt: Fixed up whitespace fail and rebased against prep patches]
    Signed-off-by: Eric Anholt

    Zou Nan hai
     

07 Jan, 2010

1 commit

  • This patch adds a new execbuf ioctl, execbuf2, for use by clients that
    want to control fence register allocation more finely. The buffer
    passed in to the new ioctl includes a new relocation type to indicate
    whether a given object needs a fence register assigned for the command
    buffer in question.

    Compatibility with the existing execbuf ioctl is implemented in terms
    of the new code, preserving the assumption that fence registers are
    required for pre-965 rendering commands.

    Signed-off-by: Jesse Barnes
    [ickle: Remove pre-emptive clear_fence_reg()]
    Signed-off-by: Chris Wilson
    Signed-off-by: Kristian Høgsberg
    [anholt: Removed dmesg spam]
    Signed-off-by: Eric Anholt

    Jesse Barnes
     

08 Dec, 2009

1 commit


04 Dec, 2009

1 commit


02 Dec, 2009

1 commit