On 18/08/16 08:51 AM, Mario Kleiner wrote:
That's what the ati-ddx/amdgpu-ddx does at the moment, as it detects the mismatch in tiling flags and uses the DRI3/Present copy path instead of the pageflip path. The problem is that the server's Present implementation doesn't request a vsync'ed start of the copy operation [...]
It waits for vblank before starting the copy.
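For readers following along, the fallback described above can be sketched roughly as below. This is only an illustration under stated assumptions: `buffer_info`, `present_path` and `choose_present_path` are hypothetical names, not identifiers from the ati/amdgpu DDX or the X server, and a real driver checks more than the tiling flags before it decides to flip.

```c
#include <stdint.h>

/* Hypothetical types and names for illustration; not actual DDX code. */
enum present_path {
    PRESENT_PATH_FLIP,  /* hand the client buffer to the display engine */
    PRESENT_PATH_COPY   /* copy the client buffer into the scanout buffer */
};

struct buffer_info {
    uint64_t tiling_flags;  /* opaque description of the buffer's memory layout */
};

/*
 * Flip only when the client buffer uses the same layout as the current
 * scanout buffer; otherwise fall back to the copy path.
 */
static enum present_path
choose_present_path(const struct buffer_info *scanout,
                    const struct buffer_info *client)
{
    if (scanout->tiling_flags == client->tiling_flags)
        return PRESENT_PATH_FLIP;

    return PRESENT_PATH_COPY;
}
```

On the copy path the remaining question is when the copy starts, which is the point under discussion here.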
There is this other approach from NVidia's Alex Goins for their proprietary driver, whose patches landed in the X-Server 1.19 master branch a couple of weeks ago. I haven't read his patches in detail yet, and so far I couldn't successfully test them with the reference implementation in modesetting ddx 1.19. AFAIK, there the display GPU exports a pair of scanout friendly, page flipping compatible dmabufs (I assume linear, contiguous, accessible by the display engines),
FWIW, that wouldn't be possible with our "older" GPUs which can't scan out from GTT: A BO can be either shared with another GPU or scanout friendly, not both at the same time.
and the offload GPU imports those and renders into them. That saves one extra copy, so it should be somewhat more efficient.
Using two shared buffers actually isn't as efficient as possible wrt inter-GPU bandwidth.
Setting it up seems to be more involved and less flexible though. So far I couldn't make it work here for testing. Maybe bugs, maybe mistakes on my side, maybe I just have the wrong hardware for it.
Yeah, my impression has been it's a rather complicated solution geared towards the Intel iGPU + proprietary nVidia use case.