Hello all,
For a while now, I've been working to fix tearing with PRIME. This is the eighth version of the DRM component for PRIME synchronization.
In this version, use_mmio_flip() tests !reservation_object_test_signaled_rcu() with test_all = false instead of directly checking for an exclusive fence via obj->base.dma_buf->resv->fence_excl.
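Roughly, the difference between the two checks is the following (illustrative fragment only, not the exact hunk; the real change is in the patch below):

/* v7: peeked at the exclusive fence pointer directly */
if (obj->base.dma_buf && obj->base.dma_buf->resv->fence_excl)
        return true;

/* v8: ask the reservation object instead; test_all=false means only the
 * exclusive fence is considered, and the check is done under RCU */
if (obj->base.dma_buf &&
    !reservation_object_test_signaled_rcu(obj->base.dma_buf->resv, false))
        return true;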
Repeat of overview below:
v1 was a more complicated patch set that added an additional fenced interface to page flipping. To avoid adding additional interfaces on top of a legacy path, v2 scrapped those patches and changed i915 page flipping paths to wait on fences attached to DMA-BUF-backed fbs. Subsequent versions involve incremental changes outlined in the patch descriptions.
I have two patches: one that implements fencing for i915's legacy mmio_flip path, and one for the atomic modesetting path as future-proofing. Currently the mmio_flip path is the one ultimately used by the X patches, due to the lack of asynchronous atomic modesetting support in i915.
With my synchronization patches to X, it is possible to export two shared buffers per CRTC instead of just one. The sink driver uses the legacy drmModePageFlip() to flip between the buffers, as the rest of the driver has yet to be ported to atomics. In the pageflip/vblank event handler, the sink driver requests a present from the source using the new X ABI function pScreen->PresentTrackedFlippingPixmap(). If the call returns successfully, it uses drmModePageFlip() to flip to the updated buffer; otherwise it waits until the next vblank and tries again.
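For reference, the sink-side loop looks roughly like the sketch below. This is not code from my X branch: the handler name, the per-CRTC bookkeeping, and the exact arguments to pScreen->PresentTrackedFlippingPixmap() are placeholders; only drmModePageFlip() is the real libdrm call.

#include <stdint.h>
#include <xf86drmMode.h>        /* drmModePageFlip() */

/* Hypothetical per-CRTC state for this sketch. */
struct sink_crtc {
        ScreenPtr pScreen;      /* X server screen (from scrnintstr.h) */
        PixmapPtr back_pixmap;  /* shared pixmap the source presents into */
        uint32_t crtc_id;
        uint32_t back_fb_id;    /* fb_id wrapping back_pixmap's BO */
};

/* Pageflip/vblank event handler: request a present, then flip if it worked. */
static void sink_flip_handler(int fd, unsigned int frame,
                              unsigned int tv_sec, unsigned int tv_usec,
                              void *data)
{
        struct sink_crtc *crtc = data;
        ScreenPtr pScreen = crtc->pScreen;

        /* Ask the source to present into the back shared buffer (new X ABI;
         * the exact signature is a placeholder here). */
        if (pScreen->PresentTrackedFlippingPixmap(pScreen, crtc->back_pixmap)) {
                /* Present succeeded: flip to the updated buffer.  The fence
                 * attached by the source keeps a fencing-aware kernel driver
                 * from scanning the buffer out before the present finishes. */
                drmModePageFlip(fd, crtc->crtc_id, crtc->back_fb_id,
                                DRM_MODE_PAGE_FLIP_EVENT, crtc);
                /* ... swap front/back bookkeeping ... */
        }
        /* Otherwise keep the current buffer and retry on the next vblank. */
}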
When the source driver presents on a given buffer, it first attaches a fence. The source driver is responsible for either using software signaling or hardware semaphore-backed fences to ensure the fence is signaled when the present is finished. If the sink's DRM driver implements fencing in the flipping path, it will guarantee that the flip won't occur until the present has finished.
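On the kernel side, the source driver's obligation amounts to something like the sketch below. This is not part of the patch set; source_present()/source_present_done() and the pre-initialized struct fence are placeholders, while reservation_object_add_excl_fence() and fence_signal() are the existing dma-buf/fence APIs.

#include <linux/dma-buf.h>
#include <linux/fence.h>
#include <linux/reservation.h>

/* Attach an exclusive fence to the shared buffer before starting the
 * present.  'f' is assumed to have been set up by the driver's own fence
 * implementation (fence_init() with its fence_ops). */
static void source_present(struct dma_buf *shared_buf, struct fence *f)
{
        struct reservation_object *resv = shared_buf->resv;

        /* The reservation object is protected by its ww_mutex. */
        ww_mutex_lock(&resv->lock, NULL);
        reservation_object_add_excl_fence(resv, f);  /* flips wait on this */
        ww_mutex_unlock(&resv->lock);

        /* ... kick off the copy/present into shared_buf ... */
}

/* Completion path (IRQ handler or software-signaled notification): once the
 * present has finished, signal the fence so the sink's flip can proceed. */
static void source_present_done(struct fence *f)
{
        fence_signal(f);
}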
This means that DRM drivers that don't implement fencing in their flipping paths won't be able to guarantee 100% tear-free PRIME with my X patches. However, the good news is that even without fencing, tearing is rare: presenting generally finishes before the next vblank, so there is no need to wait on the fence. The X patches are a drastic improvement with or without fencing, but the fencing is nonetheless important to guarantee tear-free operation under all conditions.
To give some greater context, I've uploaded my branches for DRM and the X server to GitHub. I'll move forward with upstreaming the X changes if and when these DRM patches go in.
DRM Tree: https://github.com/GoinsWithTheWind/drm-prime-sync
X Tree: https://github.com/GoinsWithTheWind/xserver-prime-sync
(branch agoins-prime-v8)
Thanks, Alex @ NVIDIA Linux Driver Team
Alex Goins (2):
  i915: wait for fence in mmio_flip_work_func
  i915: wait for fence in prepare_plane_fb
 drivers/gpu/drm/i915/intel_display.c | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)
If a buffer is backed by dmabuf, wait on its reservation object's exclusive fence before flipping.
v2: First commit
v3: Remove object_name_lock acquire
v4: Move wait ahead of mark_page_flip_active
    Use crtc->primary->fb to get GEM object instead of pending_flip_obj
    use_mmio_flip() returns true when an exclusive fence is attached
    Wait only on exclusive fences, interruptible with no timeout
v5: Move wait from do_mmio_flip to mmio_flip_work_func
    Style tweaks to more closely match rest of file
v6: Change back to uninterruptible wait to match __i915_wait_request due to
    inability to properly handle interrupted wait. Warn on error code from waiting.
v7: No change
v8: Test for !reservation_object_test_signaled_rcu(test_all=FALSE) instead of
    obj->base.dma_buf->resv->fence_excl
Signed-off-by: Alex Goins <agoins@nvidia.com>
---
 drivers/gpu/drm/i915/intel_display.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index b2270d5..84cccf8 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -44,6 +44,8 @@
 #include <drm/drm_plane_helper.h>
 #include <drm/drm_rect.h>
 #include <linux/dma_remapping.h>
+#include <linux/reservation.h>
+#include <linux/dma-buf.h>
 
 /* Primary plane formats for gen <= 3 */
 static const uint32_t i8xx_primary_formats[] = {
@@ -11088,6 +11090,10 @@ static bool use_mmio_flip(struct intel_engine_cs *ring,
 		return true;
 	else if (i915.enable_execlists)
 		return true;
+	else if (obj->base.dma_buf &&
+		 !reservation_object_test_signaled_rcu(obj->base.dma_buf->resv,
+						       false))
+		return true;
 	else
 		return ring != i915_gem_request_get_ring(obj->last_write_req);
 }
@@ -11189,6 +11195,9 @@ static void intel_mmio_flip_work_func(struct work_struct *work)
 {
 	struct intel_mmio_flip *mmio_flip =
 		container_of(work, struct intel_mmio_flip, work);
+	struct intel_framebuffer *intel_fb =
+		to_intel_framebuffer(mmio_flip->crtc->base.primary->fb);
+	struct drm_i915_gem_object *obj = intel_fb->obj;
 
 	if (mmio_flip->req)
 		WARN_ON(__i915_wait_request(mmio_flip->req,
@@ -11196,6 +11205,12 @@ static void intel_mmio_flip_work_func(struct work_struct *work)
 					    false, NULL,
 					    &mmio_flip->i915->rps.mmioflips));
 
+	/* For framebuffer backed by dmabuf, wait for fence */
+	if (obj->base.dma_buf)
+		WARN_ON(reservation_object_wait_timeout_rcu(obj->base.dma_buf->resv,
+							    false, false,
+							    MAX_SCHEDULE_TIMEOUT) < 0);
+
 	intel_do_mmio_flip(mmio_flip->crtc);
 
 	i915_gem_request_unreference__unlocked(mmio_flip->req);
In intel_prepare_plane_fb, if fb is backed by dma-buf, wait for exclusive fence
v2: First commit
v3: Remove object_name_lock acquire
    Move wait from intel_atomic_commit() to intel_prepare_plane_fb()
v4: Wait only on exclusive fences, interruptible with no timeout
v5: Style tweaks to more closely match rest of file
v6: Properly handle interrupted waits
v7: No change
v8: No change
Signed-off-by: Alex Goins <agoins@nvidia.com>
---
 drivers/gpu/drm/i915/intel_display.c | 11 +++++++++++
 1 file changed, 11 insertions(+)
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 84cccf8..a007a6a 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -13349,6 +13349,17 @@ intel_prepare_plane_fb(struct drm_plane *plane,
 	if (!obj)
 		return 0;
 
+	/* For framebuffer backed by dmabuf, wait for fence */
+	if (obj->base.dma_buf) {
+		ret = reservation_object_wait_timeout_rcu(obj->base.dma_buf->resv,
+							  false, true,
+							  MAX_SCHEDULE_TIMEOUT);
+		if (ret == -ERESTARTSYS)
+			return ret;
+
+		WARN_ON(ret < 0);
+	}
+
 	mutex_lock(&dev->struct_mutex);
 
 	if (plane->type == DRM_PLANE_TYPE_CURSOR &&
Any more feedback on this?
Thanks, Alex
On Tue, Dec 08, 2015 at 08:23:23PM -0800, Alex Goins wrote:
> Any more feedback on this?
Maarten reviewed them and they already landed in drm-intel-next-queued. I guess he just forgot to send out a quick reply.
Thanks, Daniel