The main idea behind DRM_CAP_RELEASE_FENCE is to add an additional signaling mechanism for a pageflip completion in addition to out_fence or DRM_EVENT_FLIP_COMPLETE event. This allows a compositor to start a new repaint cycle with a new buffer instead of waiting for the old buffer to be free.
Why? So, an atomic pageflip completion indicates two things to a compositor: - that it can repaint again and - that the old fb is free and can be reused (that was submitted in the previous repaint cycle)
Essentially, DRM_CAP_RELEASE_FENCE is about separating out the above two assumptions. DRM_EVENT_FLIP_COMPLETE event or out_fence would serve as a signal to repaint and newly added release_fence would provide a way to determine when old fbs can be re-used again.
This separation is really needed when the fb(s) associated with a pageflip are shared outside of the OS -- which is indeed the case with Virtio-gpu, a Virtual KMS driver. The Virtio-gpu driver runs in a Virtual Machine and can share the fb with the Host -- via Wayland UI -- in a zero-copy way. And, in this particular environment where the Host and Guest/VM are running Wayland based compositors, it would be desirable to have the Guest compositor's scanout fb be placed directly on a hardware plane on the Host -- to improve performance when there are multiple Guests running. To ensure 60 FPS and to prevent Guest and Host compositors from using an fb at the same time, the signaling of Guest's release_fence is tied to Host's wl_buffer_release event and DRM_EVENT_FLIP_COMPLETE/ out_fence signaling is tied to Host compositor's frame callback event.
Implementation: Since release_fence is almost identical to out_fence, it is implemented by making use of the existing out_fence machinery. And, although, the drm core creates the release_fence, the Virtio-gpu driver takes care of signaling it when it gets notified by the Host that the fb is free.
This work is based on the idea/suggestion from Simon and Pekka.
And, this patch series provides a solution for this Weston issue: https://gitlab.freedesktop.org/wayland/weston/-/issues/514
Tested with: Weston MR: https://gitlab.freedesktop.org/wayland/weston/-/merge_requests/668 and Qemu patches: https://lists.nongnu.org/archive/html/qemu-devel/2021-09/msg03463.html
Earlier version/discussion of this patch series can be found at: https://lists.freedesktop.org/archives/dri-devel/2021-July/317672.html
Cc: Daniel Vetter daniel@ffwll.ch Cc: Gerd Hoffmann kraxel@redhat.com Cc: Pekka Paalanen pekka.paalanen@collabora.com Cc: Simon Ser contact@emersion.fr Cc: Michel Dänzer michel@daenzer.net Cc: Tina Zhang tina.zhang@intel.com Cc: Dongwon Kim dongwon.kim@intel.com Cc: Satyeshwar Singh satyeshwar.singh@intel.com
Vivek Kasireddy (6): drm/atomic: Move out_fence creation/setup into a separate function drm/atomic: Add support for release_fence and its associated property drm: Add a capability flag to support additional flip completion signalling drm/virtio: Probe and implement VIRTIO_GPU_F_RELEASE_FENCE feature drm/virtio: Prepare set_scanout_blob to accept a fence drm/virtio: Add a fence to set_scanout_blob
drivers/gpu/drm/drm_atomic_uapi.c | 100 ++++++++++++++++++----- drivers/gpu/drm/drm_crtc.c | 2 + drivers/gpu/drm/drm_ioctl.c | 3 + drivers/gpu/drm/drm_mode_config.c | 6 ++ drivers/gpu/drm/virtio/virtgpu_debugfs.c | 1 + drivers/gpu/drm/virtio/virtgpu_drv.c | 1 + drivers/gpu/drm/virtio/virtgpu_drv.h | 5 +- drivers/gpu/drm/virtio/virtgpu_fence.c | 9 ++ drivers/gpu/drm/virtio/virtgpu_kms.c | 9 +- drivers/gpu/drm/virtio/virtgpu_plane.c | 63 ++++++++++++-- drivers/gpu/drm/virtio/virtgpu_vq.c | 7 +- include/drm/drm_atomic.h | 1 + include/drm/drm_file.h | 9 ++ include/drm/drm_mode_config.h | 15 ++++ include/uapi/drm/drm.h | 1 + include/uapi/linux/virtio_gpu.h | 2 + 16 files changed, 200 insertions(+), 34 deletions(-)
This is needed to leverage the out_fence machinery for similar but additional singalling mechanisms.
Signed-off-by: Vivek Kasireddy vivek.kasireddy@intel.com --- drivers/gpu/drm/drm_atomic_uapi.c | 57 ++++++++++++++++++++----------- 1 file changed, 37 insertions(+), 20 deletions(-)
diff --git a/drivers/gpu/drm/drm_atomic_uapi.c b/drivers/gpu/drm/drm_atomic_uapi.c index 909f31833181..6436677fa2f8 100644 --- a/drivers/gpu/drm/drm_atomic_uapi.c +++ b/drivers/gpu/drm/drm_atomic_uapi.c @@ -1135,6 +1135,38 @@ static int setup_out_fence(struct drm_out_fence_state *fence_state, return 0; }
+static struct dma_fence *crtc_create_out_fence(struct drm_crtc *crtc, + struct drm_out_fence_state **fence_state, + s32 __user *fence_ptr, + unsigned int *num_fences) +{ + struct dma_fence *fence; + struct drm_out_fence_state *f; + int ret; + + f = krealloc(*fence_state, sizeof(**fence_state) * + (*num_fences + 1), GFP_KERNEL); + if (!f) + return ERR_PTR(-ENOMEM); + + memset(&f[*num_fences], 0, sizeof(*f)); + + f[*num_fences].out_fence_ptr = fence_ptr; + *fence_state = f; + + fence = drm_crtc_create_fence(crtc); + if (!fence) + return ERR_PTR(-ENOMEM); + + ret = setup_out_fence(&f[(*num_fences)++], fence); + if (ret) { + dma_fence_put(fence); + return ERR_PTR(ret); + } + + return fence; +} + static int prepare_signaling(struct drm_device *dev, struct drm_atomic_state *state, struct drm_mode_atomic *arg, @@ -1152,6 +1184,7 @@ static int prepare_signaling(struct drm_device *dev, return 0;
for_each_new_crtc_in_state(state, crtc, crtc_state, i) { + struct dma_fence *fence; s32 __user *fence_ptr;
fence_ptr = get_out_fence_for_crtc(crtc_state->state, crtc); @@ -1182,28 +1215,12 @@ static int prepare_signaling(struct drm_device *dev, }
if (fence_ptr) { - struct dma_fence *fence; - struct drm_out_fence_state *f; - - f = krealloc(*fence_state, sizeof(**fence_state) * - (*num_fences + 1), GFP_KERNEL); - if (!f) - return -ENOMEM; - - memset(&f[*num_fences], 0, sizeof(*f)); + fence = crtc_create_out_fence(crtc, fence_state, + fence_ptr, num_fences); + if (IS_ERR(fence)) + return PTR_ERR(fence);
- f[*num_fences].out_fence_ptr = fence_ptr; - *fence_state = f;
- fence = drm_crtc_create_fence(crtc); - if (!fence) - return -ENOMEM; - - ret = setup_out_fence(&f[(*num_fences)++], fence); - if (ret) { - dma_fence_put(fence); - return ret; - }
crtc_state->event->base.fence = fence; }
Release_fence is very similar if not the same as out_fence; it is an additional signalling mechanism for a page flip completion.
Signed-off-by: Vivek Kasireddy vivek.kasireddy@intel.com --- drivers/gpu/drm/drm_atomic_uapi.c | 43 +++++++++++++++++++++++++++++-- drivers/gpu/drm/drm_crtc.c | 2 ++ drivers/gpu/drm/drm_mode_config.c | 6 +++++ include/drm/drm_atomic.h | 1 + include/drm/drm_file.h | 9 +++++++ include/drm/drm_mode_config.h | 7 +++++ 6 files changed, 66 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/drm_atomic_uapi.c b/drivers/gpu/drm/drm_atomic_uapi.c index 6436677fa2f8..5d0bf3e525b3 100644 --- a/drivers/gpu/drm/drm_atomic_uapi.c +++ b/drivers/gpu/drm/drm_atomic_uapi.c @@ -367,6 +367,23 @@ static s32 __user *get_out_fence_for_crtc(struct drm_atomic_state *state, return fence_ptr; }
+static void set_release_fence_for_crtc(struct drm_atomic_state *state, + struct drm_crtc *crtc, s32 __user *fence_ptr) +{ + state->crtcs[drm_crtc_index(crtc)].release_fence_ptr = fence_ptr; +} + +static s32 __user *get_release_fence_for_crtc(struct drm_atomic_state *state, + struct drm_crtc *crtc) +{ + s32 __user *fence_ptr; + + fence_ptr = state->crtcs[drm_crtc_index(crtc)].release_fence_ptr; + state->crtcs[drm_crtc_index(crtc)].release_fence_ptr = NULL; + + return fence_ptr; +} + static int set_out_fence_for_connector(struct drm_atomic_state *state, struct drm_connector *connector, s32 __user *fence_ptr) @@ -482,6 +499,16 @@ static int drm_atomic_crtc_set_property(struct drm_crtc *crtc, return -EFAULT;
set_out_fence_for_crtc(state->state, crtc, fence_ptr); + } else if (property == config->prop_release_fence_ptr) { + s32 __user *fence_ptr = u64_to_user_ptr(val); + + if (!fence_ptr) + return 0; + + if (put_user(-1, fence_ptr)) + return -EFAULT; + + set_release_fence_for_crtc(state->state, crtc, fence_ptr); } else if (property == crtc->scaling_filter_property) { state->scaling_filter = val; } else if (crtc->funcs->atomic_set_property) { @@ -519,6 +546,8 @@ drm_atomic_crtc_get_property(struct drm_crtc *crtc, *val = (state->gamma_lut) ? state->gamma_lut->base.id : 0; else if (property == config->prop_out_fence_ptr) *val = 0; + else if (property == config->prop_release_fence_ptr) + *val = 0; else if (property == crtc->scaling_filter_property) *val = state->scaling_filter; else if (crtc->funcs->atomic_get_property) @@ -1185,7 +1214,7 @@ static int prepare_signaling(struct drm_device *dev,
for_each_new_crtc_in_state(state, crtc, crtc_state, i) { struct dma_fence *fence; - s32 __user *fence_ptr; + s32 __user *fence_ptr, *rel_fence_ptr;
fence_ptr = get_out_fence_for_crtc(crtc_state->state, crtc);
@@ -1220,9 +1249,19 @@ static int prepare_signaling(struct drm_device *dev, if (IS_ERR(fence)) return PTR_ERR(fence);
+ crtc_state->event->base.fence = fence; + }
+ rel_fence_ptr = get_release_fence_for_crtc(crtc_state->state, + crtc); + if (rel_fence_ptr) { + fence = crtc_create_out_fence(crtc, fence_state, + rel_fence_ptr, + num_fences); + if (IS_ERR(fence)) + return PTR_ERR(fence);
- crtc_state->event->base.fence = fence; + crtc_state->event->base.release_fence = fence; }
c++; diff --git a/drivers/gpu/drm/drm_crtc.c b/drivers/gpu/drm/drm_crtc.c index 26a77a735905..e682ac04f873 100644 --- a/drivers/gpu/drm/drm_crtc.c +++ b/drivers/gpu/drm/drm_crtc.c @@ -312,6 +312,8 @@ static int __drm_crtc_init_with_planes(struct drm_device *dev, struct drm_crtc * drm_object_attach_property(&crtc->base, config->prop_mode_id, 0); drm_object_attach_property(&crtc->base, config->prop_out_fence_ptr, 0); + drm_object_attach_property(&crtc->base, + config->prop_release_fence_ptr, 0); drm_object_attach_property(&crtc->base, config->prop_vrr_enabled, 0); } diff --git a/drivers/gpu/drm/drm_mode_config.c b/drivers/gpu/drm/drm_mode_config.c index 37b4b9f0e468..fc1f5a8d2991 100644 --- a/drivers/gpu/drm/drm_mode_config.c +++ b/drivers/gpu/drm/drm_mode_config.c @@ -297,6 +297,12 @@ static int drm_mode_create_standard_properties(struct drm_device *dev) return -ENOMEM; dev->mode_config.prop_out_fence_ptr = prop;
+ prop = drm_property_create_range(dev, DRM_MODE_PROP_ATOMIC, + "RELEASE_FENCE_PTR", 0, U64_MAX); + if (!prop) + return -ENOMEM; + dev->mode_config.prop_release_fence_ptr = prop; + prop = drm_property_create_object(dev, DRM_MODE_PROP_ATOMIC, "CRTC_ID", DRM_MODE_OBJECT_CRTC); if (!prop) diff --git a/include/drm/drm_atomic.h b/include/drm/drm_atomic.h index 1701c2128a5c..00f5fad87757 100644 --- a/include/drm/drm_atomic.h +++ b/include/drm/drm_atomic.h @@ -177,6 +177,7 @@ struct __drm_crtcs_state { struct drm_crtc_commit *commit;
s32 __user *out_fence_ptr; + s32 __user *release_fence_ptr; u64 last_vblank_count; };
diff --git a/include/drm/drm_file.h b/include/drm/drm_file.h index a3acb7ac3550..ba79e1765721 100644 --- a/include/drm/drm_file.h +++ b/include/drm/drm_file.h @@ -124,6 +124,15 @@ struct drm_pending_event { */ struct dma_fence *fence;
+ /** + * @release_fence: + * + * Optional DMA fence that will be signalled by the drm driver to + * indicate that all references on FBs associated with a page flip + * can be dropped. + */ + struct dma_fence *release_fence; + /** * @file_priv: * diff --git a/include/drm/drm_mode_config.h b/include/drm/drm_mode_config.h index 48b7de80daf5..12b964540069 100644 --- a/include/drm/drm_mode_config.h +++ b/include/drm/drm_mode_config.h @@ -642,6 +642,13 @@ struct drm_mode_config { * value of type s32, and then cast that pointer to u64. */ struct drm_property *prop_out_fence_ptr; + /** + * @prop_release_fence_ptr: Sync File fd pointer that is used as an + * additional flip completion signalling mechanism. Userspace should + * provide a pointer to a value of type s32, and then cast that pointer + * to u64. + */ + struct drm_property *prop_release_fence_ptr; /** * @prop_crtc_id: Default atomic plane property to specify the * &drm_crtc.
If a driver supports this capability, it means that there would be an additional signalling mechanism for a page flip completion in addition to out_fence or DRM_MODE_PAGE_FLIP_EVENT.
This capability may only be relevant for Virtual KMS drivers and is currently used only by virtio-gpu. Also, it can provide a potential solution for: https://gitlab.freedesktop.org/wayland/weston/-/issues/514
Signed-off-by: Vivek Kasireddy vivek.kasireddy@intel.com --- drivers/gpu/drm/drm_ioctl.c | 3 +++ include/drm/drm_mode_config.h | 8 ++++++++ include/uapi/drm/drm.h | 1 + 3 files changed, 12 insertions(+)
diff --git a/drivers/gpu/drm/drm_ioctl.c b/drivers/gpu/drm/drm_ioctl.c index 8b8744dcf691..8a420844f8bc 100644 --- a/drivers/gpu/drm/drm_ioctl.c +++ b/drivers/gpu/drm/drm_ioctl.c @@ -302,6 +302,9 @@ static int drm_getcap(struct drm_device *dev, void *data, struct drm_file *file_ case DRM_CAP_CRTC_IN_VBLANK_EVENT: req->value = 1; break; + case DRM_CAP_RELEASE_FENCE: + req->value = dev->mode_config.release_fence; + break; default: return -EINVAL; } diff --git a/include/drm/drm_mode_config.h b/include/drm/drm_mode_config.h index 12b964540069..944bebf359d7 100644 --- a/include/drm/drm_mode_config.h +++ b/include/drm/drm_mode_config.h @@ -935,6 +935,14 @@ struct drm_mode_config { */ bool normalize_zpos;
+ /** + * @release_fence: + * + * If this option is set, it means there would be an additional signalling + * mechanism for a page flip completion. + */ + bool release_fence; + /** * @modifiers_property: Plane property to list support modifier/format * combination. diff --git a/include/uapi/drm/drm.h b/include/uapi/drm/drm.h index 3b810b53ba8b..8b8985f65581 100644 --- a/include/uapi/drm/drm.h +++ b/include/uapi/drm/drm.h @@ -767,6 +767,7 @@ struct drm_gem_open { * Documentation/gpu/drm-mm.rst, section "DRM Sync Objects". */ #define DRM_CAP_SYNCOBJ_TIMELINE 0x14 +#define DRM_CAP_RELEASE_FENCE 0x15
/* DRM_IOCTL_GET_CAP ioctl argument type */ struct drm_get_cap {
On Mon, 13 Sep 2021 16:35:26 -0700 Vivek Kasireddy vivek.kasireddy@intel.com wrote:
If a driver supports this capability, it means that there would be an additional signalling mechanism for a page flip completion in addition to out_fence or DRM_MODE_PAGE_FLIP_EVENT.
This capability may only be relevant for Virtual KMS drivers and is currently used only by virtio-gpu. Also, it can provide a potential solution for: https://gitlab.freedesktop.org/wayland/weston/-/issues/514
Signed-off-by: Vivek Kasireddy vivek.kasireddy@intel.com
drivers/gpu/drm/drm_ioctl.c | 3 +++ include/drm/drm_mode_config.h | 8 ++++++++ include/uapi/drm/drm.h | 1 + 3 files changed, 12 insertions(+)
diff --git a/drivers/gpu/drm/drm_ioctl.c b/drivers/gpu/drm/drm_ioctl.c index 8b8744dcf691..8a420844f8bc 100644 --- a/drivers/gpu/drm/drm_ioctl.c +++ b/drivers/gpu/drm/drm_ioctl.c @@ -302,6 +302,9 @@ static int drm_getcap(struct drm_device *dev, void *data, struct drm_file *file_ case DRM_CAP_CRTC_IN_VBLANK_EVENT: req->value = 1; break;
- case DRM_CAP_RELEASE_FENCE:
req->value = dev->mode_config.release_fence;
break;
Hi Vivek,
is this actually necessary?
I would think that userspace figures out the existence of the release fence capability by seeing that the KMS property "RELEASE_FENCE_PTR" either exists or not.
However, would we not need a client cap instead?
If a KMS driver knows that userspace is aware of "RELEASE_FENCE_PTR" and will use it when necessary, then the KMS driver can send the pageflip completion without waiting for the host OS to signal the old buffer as free for re-use.
If the KMS driver does not know that userspace can handle pageflip completing "too early", then it has no choice but to wait until the old buffer is really free before signalling pageflip completion.
Wouldn't that make sense?
Otherwise, this proposal sounds fine to me.
Thanks, pq
default: return -EINVAL; } diff --git a/include/drm/drm_mode_config.h b/include/drm/drm_mode_config.h index 12b964540069..944bebf359d7 100644 --- a/include/drm/drm_mode_config.h +++ b/include/drm/drm_mode_config.h @@ -935,6 +935,14 @@ struct drm_mode_config { */ bool normalize_zpos;
- /**
* @release_fence:
*
* If this option is set, it means there would be an additional signalling
* mechanism for a page flip completion.
*/
- bool release_fence;
- /**
- @modifiers_property: Plane property to list support modifier/format
- combination.
diff --git a/include/uapi/drm/drm.h b/include/uapi/drm/drm.h index 3b810b53ba8b..8b8985f65581 100644 --- a/include/uapi/drm/drm.h +++ b/include/uapi/drm/drm.h @@ -767,6 +767,7 @@ struct drm_gem_open {
- Documentation/gpu/drm-mm.rst, section "DRM Sync Objects".
*/ #define DRM_CAP_SYNCOBJ_TIMELINE 0x14 +#define DRM_CAP_RELEASE_FENCE 0x15
/* DRM_IOCTL_GET_CAP ioctl argument type */ struct drm_get_cap {
Hi Pekka, Thank you for reviewing this patch.
On Mon, 13 Sep 2021 16:35:26 -0700 Vivek Kasireddy vivek.kasireddy@intel.com wrote:
If a driver supports this capability, it means that there would be an additional signalling mechanism for a page flip completion in addition to out_fence or DRM_MODE_PAGE_FLIP_EVENT.
This capability may only be relevant for Virtual KMS drivers and is currently used only by virtio-gpu. Also, it can provide a potential solution for: https://gitlab.freedesktop.org/wayland/weston/-/issues/514
Signed-off-by: Vivek Kasireddy vivek.kasireddy@intel.com
drivers/gpu/drm/drm_ioctl.c | 3 +++ include/drm/drm_mode_config.h | 8 ++++++++ include/uapi/drm/drm.h | 1 + 3 files changed, 12 insertions(+)
diff --git a/drivers/gpu/drm/drm_ioctl.c b/drivers/gpu/drm/drm_ioctl.c index 8b8744dcf691..8a420844f8bc 100644 --- a/drivers/gpu/drm/drm_ioctl.c +++ b/drivers/gpu/drm/drm_ioctl.c @@ -302,6 +302,9 @@ static int drm_getcap(struct drm_device *dev, void *data, struct
drm_file *file_
case DRM_CAP_CRTC_IN_VBLANK_EVENT: req->value = 1; break;
- case DRM_CAP_RELEASE_FENCE:
req->value = dev->mode_config.release_fence;
break;
Hi Vivek,
is this actually necessary?
I would think that userspace figures out the existence of the release fence capability by seeing that the KMS property "RELEASE_FENCE_PTR" either exists or not.
[Vivek] Yeah, that makes sense. However, in order for the userspace to not see this property, we'd have to prevent drm core from exposing it; which means we need to check dev->mode_config.release_fence before attaching the property to the crtc.
However, would we not need a client cap instead?
If a KMS driver knows that userspace is aware of "RELEASE_FENCE_PTR" and will use it when necessary, then the KMS driver can send the pageflip completion without waiting for the host OS to signal the old buffer as free for re-use.
[Vivek] Right, the KMS driver can just look at whether the release_fence was added by the userspace (in the atomic commit) to determine whether it needs to wait for the old fb.
If the KMS driver does not know that userspace can handle pageflip completing "too early", then it has no choice but to wait until the old buffer is really free before signalling pageflip completion.
Wouldn't that make sense?
[Vivek] Yes; DRM_CAP_RELEASE_FENCE may not be necessary to implement the behavior you suggest which makes sense.
Otherwise, this proposal sounds fine to me.
[Vivek] Did you get a chance to review the Weston MR: https://gitlab.freedesktop.org/wayland/weston/-/merge_requests/668
Could you please take a look?
Thanks, Vivek
Thanks, pq
default: return -EINVAL; } diff --git a/include/drm/drm_mode_config.h b/include/drm/drm_mode_config.h index 12b964540069..944bebf359d7 100644 --- a/include/drm/drm_mode_config.h +++ b/include/drm/drm_mode_config.h @@ -935,6 +935,14 @@ struct drm_mode_config { */ bool normalize_zpos;
- /**
* @release_fence:
*
* If this option is set, it means there would be an additional signalling
* mechanism for a page flip completion.
*/
- bool release_fence;
- /**
- @modifiers_property: Plane property to list support modifier/format
- combination.
diff --git a/include/uapi/drm/drm.h b/include/uapi/drm/drm.h index 3b810b53ba8b..8b8985f65581 100644 --- a/include/uapi/drm/drm.h +++ b/include/uapi/drm/drm.h @@ -767,6 +767,7 @@ struct drm_gem_open {
- Documentation/gpu/drm-mm.rst, section "DRM Sync Objects".
*/ #define DRM_CAP_SYNCOBJ_TIMELINE 0x14 +#define DRM_CAP_RELEASE_FENCE 0x15
/* DRM_IOCTL_GET_CAP ioctl argument type */ struct drm_get_cap {
On Fri, 15 Oct 2021 03:37:01 +0000 "Kasireddy, Vivek" vivek.kasireddy@intel.com wrote:
Hi Pekka, Thank you for reviewing this patch.
Hi Vivek!
On Mon, 13 Sep 2021 16:35:26 -0700 Vivek Kasireddy vivek.kasireddy@intel.com wrote:
If a driver supports this capability, it means that there would be an additional signalling mechanism for a page flip completion in addition to out_fence or DRM_MODE_PAGE_FLIP_EVENT.
This capability may only be relevant for Virtual KMS drivers and is currently used only by virtio-gpu. Also, it can provide a potential solution for: https://gitlab.freedesktop.org/wayland/weston/-/issues/514
Signed-off-by: Vivek Kasireddy vivek.kasireddy@intel.com
drivers/gpu/drm/drm_ioctl.c | 3 +++ include/drm/drm_mode_config.h | 8 ++++++++ include/uapi/drm/drm.h | 1 + 3 files changed, 12 insertions(+)
diff --git a/drivers/gpu/drm/drm_ioctl.c b/drivers/gpu/drm/drm_ioctl.c index 8b8744dcf691..8a420844f8bc 100644 --- a/drivers/gpu/drm/drm_ioctl.c +++ b/drivers/gpu/drm/drm_ioctl.c @@ -302,6 +302,9 @@ static int drm_getcap(struct drm_device *dev, void *data, struct
drm_file *file_
case DRM_CAP_CRTC_IN_VBLANK_EVENT: req->value = 1; break;
- case DRM_CAP_RELEASE_FENCE:
req->value = dev->mode_config.release_fence;
break;
Hi Vivek,
is this actually necessary?
I would think that userspace figures out the existence of the release fence capability by seeing that the KMS property "RELEASE_FENCE_PTR" either exists or not.
[Vivek] Yeah, that makes sense. However, in order for the userspace to not see this property, we'd have to prevent drm core from exposing it; which means we need to check dev->mode_config.release_fence before attaching the property to the crtc.
Kernel implementation details, I don't bother with those personally. ;-)
Sounds right.
However, would we not need a client cap instead?
If a KMS driver knows that userspace is aware of "RELEASE_FENCE_PTR" and will use it when necessary, then the KMS driver can send the pageflip completion without waiting for the host OS to signal the old buffer as free for re-use.
[Vivek] Right, the KMS driver can just look at whether the release_fence was added by the userspace (in the atomic commit) to determine whether it needs to wait for the old fb.
You could do it that way, but is it a good idea? I'm not sure.
If the KMS driver does not know that userspace can handle pageflip completing "too early", then it has no choice but to wait until the old buffer is really free before signalling pageflip completion.
Wouldn't that make sense?
[Vivek] Yes; DRM_CAP_RELEASE_FENCE may not be necessary to implement the behavior you suggest which makes sense.
Otherwise, this proposal sounds fine to me.
[Vivek] Did you get a chance to review the Weston MR: https://gitlab.freedesktop.org/wayland/weston/-/merge_requests/668
Could you please take a look?
Unfortunately I cannot promise any timely feedback on that, I try to concentrate on CM&HDR. However, I'm not the only Weston reviewer, I hope.
Thanks, pq
Thanks, Vivek
Thanks, pq
default: return -EINVAL; } diff --git a/include/drm/drm_mode_config.h b/include/drm/drm_mode_config.h index 12b964540069..944bebf359d7 100644 --- a/include/drm/drm_mode_config.h +++ b/include/drm/drm_mode_config.h @@ -935,6 +935,14 @@ struct drm_mode_config { */ bool normalize_zpos;
- /**
* @release_fence:
*
* If this option is set, it means there would be an additional signalling
* mechanism for a page flip completion.
*/
- bool release_fence;
- /**
- @modifiers_property: Plane property to list support modifier/format
- combination.
diff --git a/include/uapi/drm/drm.h b/include/uapi/drm/drm.h index 3b810b53ba8b..8b8985f65581 100644 --- a/include/uapi/drm/drm.h +++ b/include/uapi/drm/drm.h @@ -767,6 +767,7 @@ struct drm_gem_open {
- Documentation/gpu/drm-mm.rst, section "DRM Sync Objects".
*/ #define DRM_CAP_SYNCOBJ_TIMELINE 0x14 +#define DRM_CAP_RELEASE_FENCE 0x15
/* DRM_IOCTL_GET_CAP ioctl argument type */ struct drm_get_cap {
Hi Pekka,
Hi Vivek!
On Mon, 13 Sep 2021 16:35:26 -0700 Vivek Kasireddy vivek.kasireddy@intel.com wrote:
If a driver supports this capability, it means that there would be an additional signalling mechanism for a page flip completion in addition to out_fence or DRM_MODE_PAGE_FLIP_EVENT.
This capability may only be relevant for Virtual KMS drivers and is currently used only by virtio-gpu. Also, it can provide a potential solution for: https://gitlab.freedesktop.org/wayland/weston/-/issues/514
Signed-off-by: Vivek Kasireddy vivek.kasireddy@intel.com
drivers/gpu/drm/drm_ioctl.c | 3 +++ include/drm/drm_mode_config.h | 8 ++++++++ include/uapi/drm/drm.h | 1 + 3 files changed, 12 insertions(+)
diff --git a/drivers/gpu/drm/drm_ioctl.c b/drivers/gpu/drm/drm_ioctl.c index 8b8744dcf691..8a420844f8bc 100644 --- a/drivers/gpu/drm/drm_ioctl.c +++ b/drivers/gpu/drm/drm_ioctl.c @@ -302,6 +302,9 @@ static int drm_getcap(struct drm_device *dev, void *data,
struct
drm_file *file_
case DRM_CAP_CRTC_IN_VBLANK_EVENT: req->value = 1; break;
- case DRM_CAP_RELEASE_FENCE:
req->value = dev->mode_config.release_fence;
break;
Hi Vivek,
is this actually necessary?
I would think that userspace figures out the existence of the release fence capability by seeing that the KMS property "RELEASE_FENCE_PTR" either exists or not.
[Vivek] Yeah, that makes sense. However, in order for the userspace to not see this property, we'd have to prevent drm core from exposing it; which means we need to check dev->mode_config.release_fence before attaching the property to the crtc.
Kernel implementation details, I don't bother with those personally. ;-)
Sounds right.
However, would we not need a client cap instead?
If a KMS driver knows that userspace is aware of "RELEASE_FENCE_PTR" and will use it when necessary, then the KMS driver can send the pageflip completion without waiting for the host OS to signal the old buffer as free for re-use.
[Vivek] Right, the KMS driver can just look at whether the release_fence was added by the userspace (in the atomic commit) to determine whether it needs to wait for the old fb.
You could do it that way, but is it a good idea? I'm not sure.
If the KMS driver does not know that userspace can handle pageflip completing "too early", then it has no choice but to wait until the old buffer is really free before signalling pageflip completion.
Wouldn't that make sense?
[Vivek] Yes; DRM_CAP_RELEASE_FENCE may not be necessary to implement the behavior you suggest which makes sense.
Otherwise, this proposal sounds fine to me.
[Vivek] Did you get a chance to review the Weston MR: https://gitlab.freedesktop.org/wayland/weston/-/merge_requests/668
Could you please take a look?
Unfortunately I cannot promise any timely feedback on that, I try to concentrate on CM&HDR. However, I'm not the only Weston reviewer, I hope.
[Kasireddy, Vivek] I was going to say it's a small patch to review but, ok np, I'll ping Simon or Michel or Daniel.
Thanks, Vivek
Thanks, pq
Thanks, Vivek
Thanks, pq
default: return -EINVAL; } diff --git a/include/drm/drm_mode_config.h b/include/drm/drm_mode_config.h index 12b964540069..944bebf359d7 100644 --- a/include/drm/drm_mode_config.h +++ b/include/drm/drm_mode_config.h @@ -935,6 +935,14 @@ struct drm_mode_config { */ bool normalize_zpos;
- /**
* @release_fence:
*
* If this option is set, it means there would be an additional signalling
* mechanism for a page flip completion.
*/
- bool release_fence;
- /**
- @modifiers_property: Plane property to list support modifier/format
- combination.
diff --git a/include/uapi/drm/drm.h b/include/uapi/drm/drm.h index 3b810b53ba8b..8b8985f65581 100644 --- a/include/uapi/drm/drm.h +++ b/include/uapi/drm/drm.h @@ -767,6 +767,7 @@ struct drm_gem_open {
- Documentation/gpu/drm-mm.rst, section "DRM Sync Objects".
*/ #define DRM_CAP_SYNCOBJ_TIMELINE 0x14 +#define DRM_CAP_RELEASE_FENCE 0x15
/* DRM_IOCTL_GET_CAP ioctl argument type */ struct drm_get_cap {
Cc: Gerd Hoffmann kraxel@redhat.com Signed-off-by: Vivek Kasireddy vivek.kasireddy@intel.com --- drivers/gpu/drm/virtio/virtgpu_debugfs.c | 1 + drivers/gpu/drm/virtio/virtgpu_drv.c | 1 + drivers/gpu/drm/virtio/virtgpu_drv.h | 1 + drivers/gpu/drm/virtio/virtgpu_fence.c | 9 +++++++++ drivers/gpu/drm/virtio/virtgpu_kms.c | 9 +++++++-- include/uapi/linux/virtio_gpu.h | 2 ++ 6 files changed, 21 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/virtio/virtgpu_debugfs.c b/drivers/gpu/drm/virtio/virtgpu_debugfs.c index c2b20e0ee030..15d2cb89ff18 100644 --- a/drivers/gpu/drm/virtio/virtgpu_debugfs.c +++ b/drivers/gpu/drm/virtio/virtgpu_debugfs.c @@ -52,6 +52,7 @@ static int virtio_gpu_features(struct seq_file *m, void *data) vgdev->has_resource_assign_uuid);
virtio_gpu_add_bool(m, "blob resources", vgdev->has_resource_blob); + virtio_gpu_add_bool(m, "release fence", vgdev->ddev->mode_config.release_fence); virtio_gpu_add_int(m, "cap sets", vgdev->num_capsets); virtio_gpu_add_int(m, "scanouts", vgdev->num_scanouts); if (vgdev->host_visible_region.len) { diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.c b/drivers/gpu/drm/virtio/virtgpu_drv.c index ed85a7863256..29cb2c355587 100644 --- a/drivers/gpu/drm/virtio/virtgpu_drv.c +++ b/drivers/gpu/drm/virtio/virtgpu_drv.c @@ -172,6 +172,7 @@ static unsigned int features[] = { VIRTIO_GPU_F_EDID, VIRTIO_GPU_F_RESOURCE_UUID, VIRTIO_GPU_F_RESOURCE_BLOB, + VIRTIO_GPU_F_RELEASE_FENCE, }; static struct virtio_driver virtio_gpu_driver = { .feature_table = features, diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h b/drivers/gpu/drm/virtio/virtgpu_drv.h index 0c4810982530..9126bca47c6d 100644 --- a/drivers/gpu/drm/virtio/virtgpu_drv.h +++ b/drivers/gpu/drm/virtio/virtgpu_drv.h @@ -140,6 +140,7 @@ struct virtio_gpu_fence { uint64_t fence_id; struct virtio_gpu_fence_driver *drv; struct list_head node; + struct dma_fence *release_fence; };
struct virtio_gpu_vbuffer { diff --git a/drivers/gpu/drm/virtio/virtgpu_fence.c b/drivers/gpu/drm/virtio/virtgpu_fence.c index d28e25e8409b..d1c36b5fa8ef 100644 --- a/drivers/gpu/drm/virtio/virtgpu_fence.c +++ b/drivers/gpu/drm/virtio/virtgpu_fence.c @@ -134,6 +134,9 @@ void virtio_gpu_fence_event_process(struct virtio_gpu_device *vgdev, if (signaled->f.context != curr->f.context) continue;
+ if (curr->release_fence) + continue; + if (!dma_fence_is_later(&signaled->f, &curr->f)) continue;
@@ -142,6 +145,12 @@ void virtio_gpu_fence_event_process(struct virtio_gpu_device *vgdev, dma_fence_put(&curr->f); }
+ if (signaled->release_fence) { + dma_fence_signal(signaled->release_fence); + dma_fence_put(signaled->release_fence); + signaled->release_fence = NULL; + } + dma_fence_signal_locked(&signaled->f); list_del(&signaled->node); dma_fence_put(&signaled->f); diff --git a/drivers/gpu/drm/virtio/virtgpu_kms.c b/drivers/gpu/drm/virtio/virtgpu_kms.c index f3379059f324..5706703eb676 100644 --- a/drivers/gpu/drm/virtio/virtgpu_kms.c +++ b/drivers/gpu/drm/virtio/virtgpu_kms.c @@ -156,6 +156,9 @@ int virtio_gpu_init(struct drm_device *dev) if (virtio_has_feature(vgdev->vdev, VIRTIO_GPU_F_RESOURCE_BLOB)) { vgdev->has_resource_blob = true; } + if (virtio_has_feature(vgdev->vdev, VIRTIO_GPU_F_RELEASE_FENCE)) { + vgdev->ddev->mode_config.release_fence = true; + } if (virtio_get_shm_region(vgdev->vdev, &vgdev->host_visible_region, VIRTIO_GPU_SHM_ID_HOST_VISIBLE)) { if (!devm_request_mem_region(&vgdev->vdev->dev, @@ -176,11 +179,13 @@ int virtio_gpu_init(struct drm_device *dev) (unsigned long)vgdev->host_visible_region.len); }
- DRM_INFO("features: %cvirgl %cedid %cresource_blob %chost_visible\n", + DRM_INFO("features: %cvirgl %cedid %cresource_blob %chost_visible \ + %crelease_fence\n", vgdev->has_virgl_3d ? '+' : '-', vgdev->has_edid ? '+' : '-', vgdev->has_resource_blob ? '+' : '-', - vgdev->has_host_visible ? '+' : '-'); + vgdev->has_host_visible ? '+' : '-', + vgdev->ddev->mode_config.release_fence ? '+' : '-');
ret = virtio_find_vqs(vgdev->vdev, 2, vqs, callbacks, names, NULL); if (ret) { diff --git a/include/uapi/linux/virtio_gpu.h b/include/uapi/linux/virtio_gpu.h index 97523a95781d..9468e17a3c13 100644 --- a/include/uapi/linux/virtio_gpu.h +++ b/include/uapi/linux/virtio_gpu.h @@ -60,6 +60,8 @@ */ #define VIRTIO_GPU_F_RESOURCE_BLOB 3
+#define VIRTIO_GPU_F_RELEASE_FENCE 4 + enum virtio_gpu_ctrl_type { VIRTIO_GPU_UNDEFINED = 0,
Hi,
--- a/include/uapi/linux/virtio_gpu.h +++ b/include/uapi/linux/virtio_gpu.h @@ -60,6 +60,8 @@ */ #define VIRTIO_GPU_F_RESOURCE_BLOB 3
+#define VIRTIO_GPU_F_RELEASE_FENCE 4
enum virtio_gpu_ctrl_type { VIRTIO_GPU_UNDEFINED = 0,
Where is the virtio-spec update for that?
thanks, Gerd
Hi Gerd,
Hi,
--- a/include/uapi/linux/virtio_gpu.h +++ b/include/uapi/linux/virtio_gpu.h @@ -60,6 +60,8 @@ */ #define VIRTIO_GPU_F_RESOURCE_BLOB 3
+#define VIRTIO_GPU_F_RELEASE_FENCE 4
enum virtio_gpu_ctrl_type { VIRTIO_GPU_UNDEFINED = 0,
Where is the virtio-spec update for that?
[Kasireddy, Vivek] I was going to do that if there'd a consensus over DRM_CAP_RELEASE_FENCE. Otherwise, I don't think VIRTIO_GPU_F_RELEASE_FENCE is needed.
Thanks, Vivek
thanks, Gerd
Signed-off-by: Vivek Kasireddy vivek.kasireddy@intel.com --- drivers/gpu/drm/virtio/virtgpu_drv.h | 4 +++- drivers/gpu/drm/virtio/virtgpu_vq.c | 7 +++++-- 2 files changed, 8 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h b/drivers/gpu/drm/virtio/virtgpu_drv.h index 9126bca47c6d..c219ebde27c3 100644 --- a/drivers/gpu/drm/virtio/virtgpu_drv.h +++ b/drivers/gpu/drm/virtio/virtgpu_drv.h @@ -406,7 +406,9 @@ virtio_gpu_cmd_set_scanout_blob(struct virtio_gpu_device *vgdev, struct virtio_gpu_object *bo, struct drm_framebuffer *fb, uint32_t width, uint32_t height, - uint32_t x, uint32_t y); + uint32_t x, uint32_t y, + struct virtio_gpu_object_array *objs, + struct virtio_gpu_fence *fence);
/* virtgpu_display.c */ int virtio_gpu_modeset_init(struct virtio_gpu_device *vgdev); diff --git a/drivers/gpu/drm/virtio/virtgpu_vq.c b/drivers/gpu/drm/virtio/virtgpu_vq.c index 2e71e91278b4..760e8b8eefb6 100644 --- a/drivers/gpu/drm/virtio/virtgpu_vq.c +++ b/drivers/gpu/drm/virtio/virtgpu_vq.c @@ -1280,7 +1280,9 @@ void virtio_gpu_cmd_set_scanout_blob(struct virtio_gpu_device *vgdev, struct virtio_gpu_object *bo, struct drm_framebuffer *fb, uint32_t width, uint32_t height, - uint32_t x, uint32_t y) + uint32_t x, uint32_t y, + struct virtio_gpu_object_array *objs, + struct virtio_gpu_fence *fence) { uint32_t i; struct virtio_gpu_set_scanout_blob *cmd_p; @@ -1289,6 +1291,7 @@ void virtio_gpu_cmd_set_scanout_blob(struct virtio_gpu_device *vgdev,
cmd_p = virtio_gpu_alloc_cmd(vgdev, &vbuf, sizeof(*cmd_p)); memset(cmd_p, 0, sizeof(*cmd_p)); + vbuf->objs = objs;
cmd_p->hdr.type = cpu_to_le32(VIRTIO_GPU_CMD_SET_SCANOUT_BLOB); cmd_p->resource_id = cpu_to_le32(bo->hw_res_handle); @@ -1308,5 +1311,5 @@ void virtio_gpu_cmd_set_scanout_blob(struct virtio_gpu_device *vgdev, cmd_p->r.x = cpu_to_le32(x); cmd_p->r.y = cpu_to_le32(y);
- virtio_gpu_queue_ctrl_buffer(vgdev, vbuf); + virtio_gpu_queue_fenced_ctrl_buffer(vgdev, vbuf, fence); }
Cc: Gerd Hoffmann kraxel@redhat.com Signed-off-by: Vivek Kasireddy vivek.kasireddy@intel.com --- drivers/gpu/drm/virtio/virtgpu_plane.c | 63 +++++++++++++++++++++++--- 1 file changed, 56 insertions(+), 7 deletions(-)
diff --git a/drivers/gpu/drm/virtio/virtgpu_plane.c b/drivers/gpu/drm/virtio/virtgpu_plane.c index a49fd9480381..ab39254c19e5 100644 --- a/drivers/gpu/drm/virtio/virtgpu_plane.c +++ b/drivers/gpu/drm/virtio/virtgpu_plane.c @@ -27,6 +27,7 @@ #include <drm/drm_damage_helper.h> #include <drm/drm_fourcc.h> #include <drm/drm_plane_helper.h> +#include <drm/drm_vblank.h>
#include "virtgpu_drv.h"
@@ -163,6 +164,60 @@ static void virtio_gpu_resource_flush(struct drm_plane *plane, } }
+static void virtio_gpu_set_scanout_blob(struct drm_plane *plane, + struct virtio_gpu_output *output) +{ + struct drm_device *dev = plane->dev; + struct virtio_gpu_device *vgdev = dev->dev_private; + struct virtio_gpu_framebuffer *vgfb; + struct virtio_gpu_object *bo; + + vgfb = to_virtio_gpu_framebuffer(plane->state->fb); + bo = gem_to_virtio_gpu_obj(vgfb->base.obj[0]); + if (bo->guest_blob && vgdev->ddev->mode_config.release_fence) { + struct drm_crtc_state *crtc_state; + struct drm_pending_event *e; + struct virtio_gpu_object_array *objs; + struct virtio_gpu_fence *fence; + + crtc_state = output->crtc.state; + if (!crtc_state || !crtc_state->event) + return; + + e = &crtc_state->event->base; + if (!e->release_fence) + return; + + fence = virtio_gpu_fence_alloc(vgdev); + if (!fence) + return; + + objs = virtio_gpu_array_alloc(1); + if (!objs) + return; + + fence->release_fence = e->release_fence; + virtio_gpu_array_add_obj(objs, vgfb->base.obj[0]); + virtio_gpu_array_lock_resv(objs); + virtio_gpu_cmd_set_scanout_blob(vgdev, output->index, bo, + plane->state->fb, + plane->state->src_w >> 16, + plane->state->src_h >> 16, + plane->state->src_x >> 16, + plane->state->src_y >> 16, + objs, fence); + } else { + virtio_gpu_cmd_set_scanout_blob(vgdev, output->index, bo, + plane->state->fb, + plane->state->src_w >> 16, + plane->state->src_h >> 16, + plane->state->src_x >> 16, + plane->state->src_y >> 16, + NULL, NULL); + } +} + + static void virtio_gpu_primary_plane_update(struct drm_plane *plane, struct drm_atomic_state *state) { @@ -215,13 +270,7 @@ static void virtio_gpu_primary_plane_update(struct drm_plane *plane, plane->state->src_y >> 16);
if (bo->host3d_blob || bo->guest_blob) { - virtio_gpu_cmd_set_scanout_blob - (vgdev, output->index, bo, - plane->state->fb, - plane->state->src_w >> 16, - plane->state->src_h >> 16, - plane->state->src_x >> 16, - plane->state->src_y >> 16); + virtio_gpu_set_scanout_blob(plane, output); } else { virtio_gpu_cmd_set_scanout(vgdev, output->index, bo->hw_res_handle,
dri-devel@lists.freedesktop.org