On Tue, Aug 02, 2016 at 03:21:08PM +0200, Enrico Weigelt, metux IT consult wrote:
Hi folks,
I'm currently thinking about adding an hw-accelerated bitblt operation. The idea goes like this:
- we add some bitblt ioctl which copies rects between bo's. (it also handles memory layouts, pixfmt conversion, etc)
- the driver can decide to let the GPU or IPU do that, if available
- if we have a suitable DMA engine (maybe only the more complex ones which can handle lines on their own ...) we'll use that
- as fallback, resort to memcpy().
Whether a DMA engine can/should be used might be highly hw specific, so that probably would be configured in DT.
To use that feature, userland could actually allocate two BO's, one that's mapped as a framebuffer to some crtc, another one just a memory buffer. It could then render to the fast memory buffer and tell the DRM to only copy over the changed regions to the graphics memory via DMA (or whatever is best on that particular hw platform).
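For illustration only, the memcpy() fallback for copying one changed rectangle from the shadow buffer to the scanout buffer could be sketched roughly as below. This is not an existing DRM interface: `struct blit_rect` and `blit_rect_memcpy()` are hypothetical names, and the sketch assumes both buffers are linear, CPU-mapped, share the same pixel format, and that the rectangle lies inside both buffers (no format conversion, no tiling, no bounds checking).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical rectangle type, in pixels; not part of any DRM uapi. */
struct blit_rect {
	uint32_t x, y, w, h;
};

/*
 * Copy one rectangle between two linear buffers of the same pixel format.
 * Pitches are in bytes, cpp is bytes per pixel. The caller must make sure
 * the rectangle fits inside both buffers.
 */
static void blit_rect_memcpy(uint8_t *dst, size_t dst_pitch,
			     const uint8_t *src, size_t src_pitch,
			     const struct blit_rect *r, size_t cpp)
{
	size_t row_bytes = (size_t)r->w * cpp;
	size_t x_off = (size_t)r->x * cpp;

	/* Rows are not contiguous in memory, so copy line by line. */
	for (uint32_t row = 0; row < r->h; row++) {
		size_t y = (size_t)r->y + row;

		memcpy(dst + y * dst_pitch + x_off,
		       src + y * src_pitch + x_off,
		       row_bytes);
	}
}
```

Userspace would call this once per damaged rectangle after rendering into the fast buffer; a DMA- or GPU-backed path would have to provide the same per-rectangle semantics to be interchangeable with it.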
What do you think about that idea?
If you mean "add a generic hw-accelerated bitblt operation": This is not how drm works. The generic kms stuff is about display only, with just very basic (hence "dumb") buffer allocation support in a generic way.
If you mean "expose the dma engine I have here to userspace in driver-private ioctls with the trade-off logic between that, kms compositing using the display block and memcpy in userspace", then go ahead ;-) But if you do that, pls don't forget that for any uapi the drm subsystem requires corresponding open source userspace (in a real app/compositor, not just some toy test or something similar).
Cheers, Daniel