Hello all,
For a while now, I've been working to fix tearing with PRIME. This is the fifth version of the DRM component for PRIME synchronization.
v1 was a more complicated patch set that added an additional fenced interface to page flipping. To avoid adding additional interfaces on top of a legacy path, v2 scrapped those patches and changed i915 page flipping paths to wait on fences attached to DMA-BUF-backed fbs. Subsequent versions involve incremental changes outlined in the patch descriptions.
Repeat of overview below:
I have two patches, one that implements fencing for i915's legacy mmio_flip path, and one for atomic modesetting for futureproofing. Currently the mmio_flip path is the one ultimately used by the X patches, due to the lack of asynchronous atomic modesetting support in i915.
With my synchronization patches to X, it is possible to export two shared buffers per crtc instead of just one. The sink driver uses the legacy drmModePageFlip() to flip between the buffers, as the rest of the driver has yet to be ported to atomics. In the pageflip/vblank event handler, the sink driver requests a present from the source using the new X ABI function pScreen->PresentTrackedFlippingPixmap(). If the call returns successfully, it uses drmModePageFlip() to flip to the updated buffer, otherwise it waits until the next vblank and tries again.
When the source driver presents on a given buffer, it first attaches a fence. The source driver is responsible for either using software signaling or hardware semaphore-backed fences to ensure the fence is signaled when the present is finished. If the sink's DRM driver implements fencing in the flipping path, it will guarantee that that flip won't occur until the present has finished.
This means that DRM drivers that don't implement fencing in their flipping paths won't be able to guarantee 100% tear-free PRIME with my X patches. However, the good news is that even without fencing, tearing is rare. Generally presenting finishes before the next vblank, so there is no need to wait on the fence. The X patches are a drastic improvement with or without fencing, but the fencing is nonetheless important to guarantee tear-free under all conditions.
To give some greater context, I've uploaded my branches for DRM and the X server to Github. I'll move forward with upstreaming the X changes if and when these DRM patches go in.
DRM Tree: https://github.com/GoinsWithTheWind/drm-prime-sync X Tree: https://github.com/GoinsWithTheWind/xserver-prime-sync
(branch agoins-prime-v5)
Thanks, Alex @ NVIDIA Linux Driver Team
Alex Goins (2): i915: wait for fence in mmio_flip_work_func i915: wait for fence in prepare_plane_fb
drivers/gpu/drm/i915/intel_display.c | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+)
If a buffer is backed by dmabuf, wait on its reservation object's exclusive fence before flipping.
v2: First commit v3: Remove object_name_lock acquire v4: Move wait ahead of mark_page_flip_active Use crtc->primary->fb to get GEM object instead of pending_flip_obj use_mmio_flip() return true when exclusive fence is attached Wait only on exclusive fences, interruptible with no timeout v5: Move wait from do_mmio_flip to mmio_flip_work_func Style tweaks to more closely match rest of file
Signed-off-by: Alex Goins agoins@nvidia.com --- drivers/gpu/drm/i915/intel_display.c | 13 +++++++++++++ 1 file changed, 13 insertions(+)
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index b2270d5..eef3475 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -44,6 +44,8 @@ #include <drm/drm_plane_helper.h> #include <drm/drm_rect.h> #include <linux/dma_remapping.h> +#include <linux/reservation.h> +#include <linux/dma-buf.h>
/* Primary plane formats for gen <= 3 */ static const uint32_t i8xx_primary_formats[] = { @@ -11088,6 +11090,8 @@ static bool use_mmio_flip(struct intel_engine_cs *ring, return true; else if (i915.enable_execlists) return true; + else if (obj->base.dma_buf && obj->base.dma_buf->resv->fence_excl) + return true; else return ring != i915_gem_request_get_ring(obj->last_write_req); } @@ -11189,6 +11193,9 @@ static void intel_mmio_flip_work_func(struct work_struct *work) { struct intel_mmio_flip *mmio_flip = container_of(work, struct intel_mmio_flip, work); + struct intel_framebuffer *intel_fb = + to_intel_framebuffer(mmio_flip->crtc->base.primary->fb); + struct drm_i915_gem_object *obj = intel_fb->obj;
if (mmio_flip->req) WARN_ON(__i915_wait_request(mmio_flip->req, @@ -11196,6 +11203,12 @@ static void intel_mmio_flip_work_func(struct work_struct *work) false, NULL, &mmio_flip->i915->rps.mmioflips));
+ /* For framebuffer backed by dmabuf, wait for fence */ + if (obj->base.dma_buf) + reservation_object_wait_timeout_rcu(obj->base.dma_buf->resv, + false, true, + MAX_SCHEDULE_TIMEOUT); + intel_do_mmio_flip(mmio_flip->crtc);
i915_gem_request_unreference__unlocked(mmio_flip->req);
In intel_prepare_plane_fb, if fb is backed by dma-buf, wait for exclusive fence
v2: First commit v3: Remove object_name_lock acquire Move wait from intel_atomic_commit() to intel_prepare_plane_fb() v4: Wait only on exclusive fences, interruptible with no timeout v5: Style tweaks to more closely match rest of file
Signed-off-by: Alex Goins agoins@nvidia.com --- drivers/gpu/drm/i915/intel_display.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index eef3475..f5ab8a7 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -13347,6 +13347,12 @@ intel_prepare_plane_fb(struct drm_plane *plane, if (!obj) return 0;
+ /* For framebuffer backed by dmabuf, wait for fence */ + if (obj->base.dma_buf) + reservation_object_wait_timeout_rcu(obj->base.dma_buf->resv, + false, true, + MAX_SCHEDULE_TIMEOUT); + mutex_lock(&dev->struct_mutex);
if (plane->type == DRM_PLANE_TYPE_CURSOR &&
Op 20-11-15 om 23:09 schreef Alex Goins:
In intel_prepare_plane_fb, if fb is backed by dma-buf, wait for exclusive fence
v2: First commit v3: Remove object_name_lock acquire Move wait from intel_atomic_commit() to intel_prepare_plane_fb() v4: Wait only on exclusive fences, interruptible with no timeout v5: Style tweaks to more closely match rest of file
Signed-off-by: Alex Goins agoins@nvidia.com
drivers/gpu/drm/i915/intel_display.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index eef3475..f5ab8a7 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -13347,6 +13347,12 @@ intel_prepare_plane_fb(struct drm_plane *plane, if (!obj) return 0;
- /* For framebuffer backed by dmabuf, wait for fence */
- if (obj->base.dma_buf)
reservation_object_wait_timeout_rcu(obj->base.dma_buf->resv,
false, true,
MAX_SCHEDULE_TIMEOUT);
This is missing -ERESTARTSYS handling for interruptible waits here.
Full error handling could be fun with radeon fences, which may return -EDEADLK. Maybe it's best to only warn if an error other than -ERESTARTSYS is returned, and continue. There's not much recovery we can do if someone else's fence returns an error.
dri-devel@lists.freedesktop.org