On Fri, Aug 9, 2013 at 1:31 PM, Tom Cooksey <tom.cooksey@arm.com> wrote:
So in the above, after X receives the second DRI2SwapBuffers, it doesn't need to get scheduled again for the next frame to be both rendered by the GPU and issued to the display for scanout.
well, this is really only an issue if you are so loaded that you don't get a chance to schedule for ~16ms.. which is a pretty long time.
Yes - it really is 16ms (minus interrupt/workqueue latency), isn't it? Hmmm, that does sound very long. Will try out some experiments and see.
yeah
If you are triple buffering, it should not end up in the critical path (since the gpu already has the 3rd buffer to start on the next frame). And, well, if you do it all in the kernel you probably need to toss things over to a workqueue anyways.
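To make that concrete, here's a rough sketch of the kind of deferral meant above, with a made-up driver's flip handling (none of the names below come from a real DRM driver): the IRQ handler only queues work, and the step that actually programs the next scanout buffer runs later in process context, which is exactly where the scheduling latency discussed above can creep in.

#include <linux/kernel.h>
#include <linux/workqueue.h>
#include <linux/interrupt.h>

/* Hypothetical per-CRTC state; not from any real driver. */
struct flip_state {
	struct work_struct flip_work;
	/* next framebuffer, display controller registers, ... */
};

static void flip_work_fn(struct work_struct *work)
{
	struct flip_state *fs = container_of(work, struct flip_state, flip_work);

	/* Runs in process context: program the display controller with the
	 * next buffer, send the flip-complete event, etc. */
	(void)fs;
}

static irqreturn_t flip_done_irq(int irq, void *data)
{
	struct flip_state *fs = data;

	/* Keep the IRQ handler short; defer the real work. */
	schedule_work(&fs->flip_work);
	return IRQ_HANDLED;
}

static void flip_state_init(struct flip_state *fs)
{
	INIT_WORK(&fs->flip_work, flip_work_fn);
}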
Just a quick comment on the kernel flip queue issue.
16 ms scheduling latency sounds awful, but it's totally doable with a less-than-stellar ddx driver going into limbo land and so preventing your single-threaded X from doing more useful stuff. Is this really the linux scheduler being stupid?
Ahahhaaa!! Yes!!! Really good point. We generally don't have 2D HW and so rely on pixman to perform all 2D operations, which does indeed tie up that thread for fairly long periods of time.
We've had internal discussions about introducing a thread (gulp) in the DDX to off-load drawing operations to. I think we were all a bit scared by that idea though.
a thread does sound a bit scary.. it probably could be done if you treat it like a virtual cpu and have WaitMarker or PrepareAccess synchronize with it properly for sw fallbacks..
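Something along these lines, maybe - a rough sketch against the EXA driver hooks; render_thread_flush() and render_queue_touches() are hypothetical helpers standing in for whatever queue the off-load thread would consume:

#include "exa.h"

/* Hypothetical helpers for the off-load drawing thread (not part of EXA). */
static void render_thread_flush(void);
static Bool render_queue_touches(PixmapPtr pPix);

static Bool
MyPrepareAccess(PixmapPtr pPix, int index)
{
    /* A software fallback is about to touch pPix with the CPU: drain any
     * queued drawing that targets it first, as if it were real 2D HW. */
    if (render_queue_touches(pPix))
        render_thread_flush();
    return TRUE;
}

static void
MyWaitMarker(ScreenPtr pScreen, int marker)
{
    /* Wait until the off-load thread has retired everything up to
     * 'marker' (here, simply a full flush). */
    render_thread_flush();
}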
I bet you'd be much better off just making non-scanout pixmaps cached and doing cache sync ops when needed for dri2 buffers. Sw fallbacks on uncached buffers probably aren't exactly the hot ticket.
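A sketch of what that could look like on the DDX side; the mydrv_* structs and ioctls below are hypothetical stand-ins for whatever cache-maintenance hooks the kernel driver actually exposes (omapdrm, for instance, has GEM_CPU_PREP/FINI style ioctls):

#include <stdint.h>
#include <sys/ioctl.h>

/* Hypothetical driver-specific cache-maintenance ioctls. */
struct mydrv_gem_cpu_prep { uint32_t handle; uint32_t op; /* READ/WRITE */ };
struct mydrv_gem_cpu_fini { uint32_t handle; };
#define DRM_IOCTL_MYDRV_GEM_CPU_PREP _IOW('d', 0x40, struct mydrv_gem_cpu_prep)
#define DRM_IOCTL_MYDRV_GEM_CPU_FINI _IOW('d', 0x41, struct mydrv_gem_cpu_fini)

/* Before pixman (a sw fallback) touches a cached, non-scanout pixmap:
 * make the CPU caches coherent with whatever the GPU wrote. */
static int pixmap_cpu_prep(int drm_fd, uint32_t handle, uint32_t op)
{
    struct mydrv_gem_cpu_prep req = { .handle = handle, .op = op };
    return ioctl(drm_fd, DRM_IOCTL_MYDRV_GEM_CPU_PREP, &req);
}

/* After the fallback finishes: clean/flush so the GPU or display engine
 * sees the CPU-rendered data before the buffer goes back out via DRI2. */
static int pixmap_cpu_fini(int drm_fd, uint32_t handle)
{
    struct mydrv_gem_cpu_fini req = { .handle = handle };
    return ioctl(drm_fd, DRM_IOCTL_MYDRV_GEM_CPU_FINI, &req);
}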
BTW: I wasn't suggesting it was the linux scheduler being stupid, just that there is sometimes lots of contention over the CPU cores, and X is just one thread among many wanting to run.
At least my impression was that the hw/kernel flip queue is there to save power: you can queue up a few frames and everything goes to sleep for half a second or so (at 24fps, or whatever movie you're showing). Needing to schedule 5 frames ahead with pageflips under load is just guaranteed to result in really horrible interactivity and thus an awful user experience.
Agreed. There's always a trade-off between tolerance to variable frame rendering time/system latency (lots of buffers) and UI latency (few buffers).
As a side note, video playback is one use-case for explicit sync objects which implicit/buffer-based sync doesn't handle: queue up lots of video frames for display, but mark those "display buffer" operations as depending on explicit sync objects which get signalled by the audio clock. Not sure Android actually does that yet, though. Anyway, off topic.
w/ dma-fence, rather than explicit fences, I suppose you could add some way to queue the buffer to the audio device and have the audio device signal the fence. Although it does sound a bit funny for ALSA to have a DMA_BUF_AV_SYNC ioctl for this sort of case?
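For illustration, a rough sketch of the audio-side signalling, written against the dma-fence kernel API as it exists in today's mainline (it was still settling when this was discussed). The av_sync_timeline structure and the point at which the fence gets attached to the queued page flip are assumptions, and the DMA_BUF_AV_SYNC ioctl above is of course hypothetical.

#include <linux/dma-fence.h>
#include <linux/slab.h>
#include <linux/spinlock.h>

/* Hypothetical "audio clock" fence timeline; nothing like this exists in ALSA. */
struct av_sync_timeline {
	spinlock_t lock;
	u64 context;
	u64 next_seqno;
};

static const char *av_fence_driver_name(struct dma_fence *f)
{
	return "av-sync";
}

static const char *av_fence_timeline_name(struct dma_fence *f)
{
	return "audio-clock";
}

static const struct dma_fence_ops av_fence_ops = {
	.get_driver_name   = av_fence_driver_name,
	.get_timeline_name = av_fence_timeline_name,
};

static void av_sync_timeline_init(struct av_sync_timeline *tl)
{
	spin_lock_init(&tl->lock);
	tl->context = dma_fence_context_alloc(1);
	tl->next_seqno = 0;
}

/* One fence per queued video frame; the display side waits on it before
 * scanning the frame out. */
static struct dma_fence *av_fence_create(struct av_sync_timeline *tl)
{
	struct dma_fence *f = kzalloc(sizeof(*f), GFP_KERNEL);

	if (!f)
		return NULL;
	dma_fence_init(f, &av_fence_ops, &tl->lock, tl->context, ++tl->next_seqno);
	return f;
}

/* Called by the audio driver when its clock reaches the frame's
 * presentation time: the queued page flip waiting on this fence may
 * now proceed. */
static void av_fence_pts_reached(struct dma_fence *f)
{
	dma_fence_signal(f);
}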
I don't think there is anything like it in EGL, but there is the oml_sync_control extension for more precise control of presentation time. But this is all implemented in userspace and doesn't really work out with more than double buffering. This is part of the reason for the timing information in vblank events. Of course it doesn't have any tie-in to the audio subsystem, but in practice this really shouldn't be needed. Audio samples are either rendered at a very predictable rate, or they sound like sh** with lots of pops and cut-outs.
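For reference, client-side use of OML_sync_control looks roughly like this; the entry points are the standard ones from the GLX extension, fetched via glXGetProcAddress, and the "two vblanks from now" target is just an arbitrary example:

#include <stdint.h>
#include <X11/Xlib.h>
#include <GL/glx.h>
#include <GL/glxext.h>

/* Schedule the next swap for a specific vblank count (MSC) instead of
 * "as soon as possible".  Assumes the GLX_OML_sync_control extension is
 * advertised for this display/drawable. */
static void swap_two_vblanks_from_now(Display *dpy, GLXDrawable drawable)
{
    PFNGLXGETSYNCVALUESOMLPROC pGetSyncValues =
        (PFNGLXGETSYNCVALUESOMLPROC)
            glXGetProcAddress((const GLubyte *)"glXGetSyncValuesOML");
    PFNGLXSWAPBUFFERSMSCOMLPROC pSwapBuffersMsc =
        (PFNGLXSWAPBUFFERSMSCOMLPROC)
            glXGetProcAddress((const GLubyte *)"glXSwapBuffersMscOML");
    int64_t ust, msc, sbc;

    if (!pGetSyncValues || !pSwapBuffersMsc)
        return;

    /* UST is a system timestamp, MSC the vblank counter, SBC the swap count. */
    pGetSyncValues(dpy, drawable, &ust, &msc, &sbc);

    /* Present no earlier than two vblanks from now (divisor/remainder of 0
     * means "absolute target MSC"). */
    pSwapBuffersMsc(dpy, drawable, msc + 2, 0, 0);
}

With a non-zero divisor the swap instead lands on the next MSC where msc % divisor == remainder, which is how you would lock to every Nth refresh, but as noted above none of this gives you deeper-than-double-buffered queueing or an audio tie-in.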
BR, -R
Cheers,
Tom