-----Original Message----- From: Daniel Vetter daniel@ffwll.ch Sent: Friday, July 30, 2021 6:26 PM To: Kasireddy, Vivek vivek.kasireddy@intel.com Cc: dri-devel@lists.freedesktop.org; Daniel Vetter daniel@ffwll.ch; Gerd Hoffmann kraxel@redhat.com; Pekka Paalanen ppaalanen@gmail.com; Simon Ser contact@emersion.fr; Michel Dänzer michel@daenzer.net; Zhang, Tina tina.zhang@intel.com; Kim, Dongwon dongwon.kim@intel.com Subject: Re: [RFC v1 0/4] drm: Add support for DRM_CAP_DEFERRED_OUT_FENCE capability
On Thu, Jul 29, 2021 at 01:16:55AM -0700, Vivek Kasireddy wrote:
By separating the OUT_FENCE signalling from pageflip completion allows a Guest compositor to start a new repaint cycle with a new buffer instead of waiting for the old buffer to be free.
This work is based on the idea/suggestion from Simon and Pekka.
This capability can be a solution for this issue: https://gitlab.freedesktop.org/wayland/weston/-/issues/514
Corresponding Weston MR: https://gitlab.freedesktop.org/wayland/weston/-/merge_requests/668
Uh I kinda wanted to discuss this a bit more before we jump into typing code, but well I guess not that much work yet.
So maybe I'm not understanding the problem, but I think the fundamental underlying issue is that with KMS you can have at most 2 buffers in-flight, due to our queue depth limit of 1 pending flip.
Unfortunately that means for virtual hw where it takes a few more steps/vblanks until the framebuffer actually shows up on screen and is scanned out, we suffer deeply. The usual fix for that is to drop the latency and increase throughput, and have more buffers in-flight. Which this patch tries to do.
Now I think where we go wrong here is that we're trying to hack this up by defining different semantics for the out-fence and for the drm-event. Imo that's wrong, they're both meant to show eactly the same thing:
- when is the new frame actually visible to the user (as in, eyeballs in a human head, preferrably, not the time when we've handed the buffer off to the virtual hw)
- when is the previous buffer no longer being used by the scanout hw
We do cheat a bit right now in so far that we assume they're both the same, as in, panel-side latency is currently the compositor's problem to figure out.
So for virtual hw I think the timestamp and even completion really need to happen only when the buffer has been pushed through the entire virtualization chain, i.e. ideally we get the timestamp from the kms driver from the host side. Currently that's not done, so this is most likely quite broken already (virtio relies on the no-vblank auto event sending, which definitely doesn't wait for anything, or I'm completely missing something).
Agree. One lesson we got from previous direct-display related work is that using host hardware event is kind of "must". Otherwise, problems like flickering or tearing or frame drop will come out. Besides, as the wayland-ui is working as a weston client, it needs to have more than 2 buffers to support the full-frame redraw. I tried the Weston-simple-dmabuf-egl with 2 buffers and it was bad.
BR, Tina
I think instead of hacking up some ill-defined 1.5 queue depth support, what we should do is support queue depth > 1 properly. So:
Change atomic to support queue depth > 1, this needs to be a per-driver thing due to a bunch of issues in driver code. Essentially drivers must never look at obj->state pointers, and only ever look up state through the passed-in drm_atomic_state * update container.
Aside: virtio should loose all it's empty hooks, there's no point in that.
We fix virtio to send out the completion event at the end of this entire pipeline, i.e. virtio code needs to take care of sending out the crtc_state->event correctly.
We probably also want some kind of (maybe per-crtc) recommended queue depth property so compositors know how many buffers to keep in flight. Not sure about that.
It's a bit more work, but also a lot less hacking around infrastructure in dubious ways.
Thoughts?
Cheers, Daniel
Cc: Daniel Vetter daniel@ffwll.ch Cc: Gerd Hoffmann kraxel@redhat.com Cc: Pekka Paalanen ppaalanen@gmail.com Cc: Simon Ser contact@emersion.fr Cc: Michel Dänzer michel@daenzer.net Cc: Tina Zhang tina.zhang@intel.com Cc: Dongwon Kim dongwon.kim@intel.com
Vivek Kasireddy (4): drm: Add a capability flag to support deferred out_fence signalling virtio-gpu uapi: Add VIRTIO_GPU_F_OUT_FENCE feature drm/virtio: Add VIRTIO_GPU_CMD_RESOURCE_OUT_FENCE cmd drm/virtio: Probe and implement VIRTIO_GPU_F_OUT_FENCE feature
drivers/gpu/drm/drm_file.c | 11 +++--- drivers/gpu/drm/drm_ioctl.c | 3 ++ drivers/gpu/drm/virtio/virtgpu_debugfs.c | 1 + drivers/gpu/drm/virtio/virtgpu_drv.c | 1 + drivers/gpu/drm/virtio/virtgpu_drv.h | 6 ++++ drivers/gpu/drm/virtio/virtgpu_fence.c | 9 +++++ drivers/gpu/drm/virtio/virtgpu_kms.c | 10 ++++-- drivers/gpu/drm/virtio/virtgpu_plane.c | 44 +++++++++++++++++++++++- drivers/gpu/drm/virtio/virtgpu_vq.c | 17 +++++++++ include/drm/drm_mode_config.h | 9 +++++ include/uapi/drm/drm.h | 1 + include/uapi/linux/virtio_gpu.h | 12 +++++++ 12 files changed, 117 insertions(+), 7 deletions(-)
-- 2.30.2
-- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch