On Thu, Dec 21, 2017 at 12:22 AM, Kristian Kristensen hoegsberg@google.com wrote:
On Wed, Dec 20, 2017 at 12:41 PM, Miguel Angel Vico mvicomoya@nvidia.com wrote:
Inline.
On Wed, 20 Dec 2017 11:54:10 -0800 Kristian Høgsberg hoegsberg@gmail.com wrote:
On Wed, Dec 20, 2017 at 11:51 AM, Daniel Vetter daniel@ffwll.ch wrote:
Since this also involves the kernel let's add dri-devel ...
Yeah, I forgot. Thanks Daniel!
On Wed, Dec 20, 2017 at 5:51 PM, Miguel Angel Vico mvicomoya@nvidia.com wrote:
Hi all,
As many of you already know, I've been working with James Jones on the Generic Device Allocator project lately. He started a discussion thread some weeks ago seeking feedback on the current prototype of the library and advice on how to move all this forward, from a prototype stage to production. For further reference, see:
https://lists.freedesktop.org/archives/mesa-dev/2017-November/177632.html
From the thread above, we came up with very interesting high-level design ideas for one of the currently missing parts in the library: usage transitions. That's something I'll personally work on during the following weeks.
In the meantime, I've been working on putting together an open source implementation of the allocator mechanisms using the Nouveau driver for all to be able to play with.
Below I'm seeking feedback on a bunch of changes I had to make to different components of the graphics stack:
** Allocator **
An allocator driver implementation on top of Nouveau. The current implementation only handles pitch linear layouts, but that's enough to have the kmscube port working using the allocator and Nouveau drivers.
You can pull these changes from
https://github.com/mvicomoya/allocator/tree/wip/mvicomoya/nouveau-driver
** Mesa **
James's kmscube port to use the allocator relies on the EXT_external_objects extension to import allocator allocations to OpenGL as a texture object. However, the Nouveau implementation of these mechanisms is missing in Mesa, so I went ahead and added them.
You can pull these changes from
https://github.com/mvicomoya/mesa/tree/wip/mvicomoya/EXT_external_objects-no...
Also, James's kmscube port uses the NVX_unix_allocator_import extension to attach allocator metadata to texture objects so the driver knows how to deal with the imported memory.
Note that there isn't a formal spec for this extension yet. For now, it just serves as an experimental mechanism to import allocator memory in OpenGL, and attach metadata to texture objects.
You can pull these changes (written on top of the above) from:
https://github.com/mvicomoya/mesa/tree/wip/mvicomoya/NVX_unix_allocator_impo...
** kmscube **
Mostly minor fixes and improvements on top of James's port to use the allocator. The main change is that the allocator initialization path will use EGL_MESA_platform_surfaceless if the EGLDevice platform isn't supported by the underlying EGL implementation.
You can pull these changes from:
https://github.com/mvicomoya/kmscube/tree/wip/mvicomoya/allocator-nouveau
With all the above you should be able to get kmscube working using the allocator on top of the Nouveau driver.
Another of the missing pieces before we can move this to production is importing allocations to DRM FB objects. This is probably one of the most sensitive parts of the project as it requires modification/addition of kernel driver interfaces.
At XDC2017, James had several hallway conversations with several people about this, all having different opinions. I'd like to take this opportunity to also start a discussion about what's the best option to create a path to get allocator allocations added as DRM FB objects.
These are the few options we've considered to start with:
A) Have vendor-private ioctls to set properties on GEM objects that are inherited by the FB objects. This is how our (NVIDIA) desktop DRM driver currently works. This would require every vendor to add their own ioctl to process allocator metadata, but the metadata is actually a vendor-agnostic object, more like DRM modifiers. We'd like to come up with a vendor-agnostic solution that can be integrated into core DRM.
B) Add a new drmModeAddFBWithMetadata() command that takes allocator metadata blobs for each plane of the FB. Some people in the community have mentioned this is their preferred design. This, however, means we'd have to go through the exercise of adding another metadata mechanism to the whole graphics stack.
C) Shove allocator metadata into DRM by defining it to be a separate plane in the image, and using the existing DRM modifiers mechanism to indicate there is another plane for each "real" plane added. It isn't clear how this scales to surfaces that already need several planes, but there are some people that see this as the only way forward. Also, we would have to create a separate GEM buffer for the metadata itself, which seems excessive.
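For illustration only, option (B) could look roughly like the sketch below: a per-plane metadata blob carried next to the usual addfb2-style arrays. Every name here (demo_fb_cmd_metadata, demo_metadata_blob, DEMO_MAX_METADATA_SIZE, demo_check_fb_metadata) is made up for this sketch and should not be read as the prototype's actual interface. The size cap is an assumption too.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_PLANES        4
#define DEMO_MAX_METADATA_SIZE 4096u /* hypothetical cap on one blob */

/* Hypothetical opaque allocator metadata attached to one plane. */
struct demo_metadata_blob {
    const void *data;
    uint32_t size;
};

/* Hypothetical request: addfb2-style arrays plus one blob per plane. */
struct demo_fb_cmd_metadata {
    uint32_t width, height, pixel_format;
    uint32_t handles[DEMO_MAX_PLANES];
    uint32_t pitches[DEMO_MAX_PLANES];
    uint32_t offsets[DEMO_MAX_PLANES];
    struct demo_metadata_blob metadata[DEMO_MAX_PLANES];
};

/*
 * Returns 0 if plane 0 is present, every used plane (non-zero handle)
 * carries a non-empty blob within the size cap, and unused planes carry
 * no blob. Returns -1 otherwise.
 */
int demo_check_fb_metadata(const struct demo_fb_cmd_metadata *cmd)
{
    assert(cmd != NULL);

    if (cmd->handles[0] == 0)
        return -1;

    for (int i = 0; i < DEMO_MAX_PLANES; i++) {
        const struct demo_metadata_blob *blob = &cmd->metadata[i];

        if (cmd->handles[i] == 0) {
            if (blob->data != NULL || blob->size != 0)
                return -1;
            continue;
        }
        if (blob->data == NULL || blob->size == 0 ||
            blob->size > DEMO_MAX_METADATA_SIZE)
            return -1;
    }
    return 0;
}
```

The point of the sketch is only the shape of the request: the blob sits where the per-plane modifier sits today, which is why this option touches every layer that currently passes modifiers around.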
We personally like option (B) better, and have already started to prototype the new path (which is actually very similar to the drmModeAddFB2() one). You can take a look at the new interfaces here:
https://github.com/mvicomoya/linux/tree/wip/mvicomoya/drm_addfb_with_metadat...
There may be other options that haven't been explored yet that could be a better choice than the above, so any suggestion will be greatly appreciated.
What kind of metadata are we talking about here? Addfb has tons of stuff already that's "metadata". The only thing I've spotted is PITCH_ALIGNMENT, which is maybe something we want drm drivers to tell userspace, but definitely not something addfb ever needs. addfb only needs the resulting pitch that we actually allocated (and might decide it doesn't like that, but that's a different issue).
Sorry, I failed to make it clear. Metadata here refers to all allocation parameters the generic allocator was given to allocate memory. That currently means the final capability set used for the allocation, including all constraints (such as memory alignment, pitch alignment, and others) and capabilities, describing allocation properties like tiling formats, compression, and such.
Yeah, that part was all clear. I'd want more details of what exact kind of metadata. fast-clear colors? tiling layouts? aux data for the compressor? hiz (or whatever you folks call it) tree?
As you say, we've discussed massive amounts of different variants on this, and there's different answers for different questions. Consensus seems to be that bigger stuff (compression data, hiz, clear colors, ...) should be stored in aux planes, while the exact layout and what kind of aux planes you have are encoded in the modifier.
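Roughly, and only as a sketch, this is how a main plane plus one aux plane in the same buffer would be described with the existing addfb2 arrays. struct demo_fb_cmd is a local stand-in for the relevant fields of struct drm_mode_fb_cmd2, and the 4096-byte aux alignment is an assumption, not any driver's actual requirement. The same modifier goes into every plane slot because the uapi header notes that the modifier must be identical for all planes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Local stand-in for the fields of struct drm_mode_fb_cmd2 used here. */
struct demo_fb_cmd {
    uint32_t width, height;
    uint32_t pixel_format;
    uint32_t flags;
    uint32_t handles[4];
    uint32_t pitches[4];
    uint32_t offsets[4];
    uint64_t modifier[4];
};

#define DEMO_FB_MODIFIERS (1u << 1) /* same bit as DRM_MODE_FB_MODIFIERS */
#define DEMO_AUX_ALIGN    4096u     /* assumed alignment of the aux plane */

/* Round v up to the next multiple of a. */
uint64_t demo_align_up(uint64_t v, uint64_t a)
{
    return (v + a - 1) / a * a;
}

/*
 * Describe one buffer holding the main surface at offset 0 followed by an
 * aux plane (compression data, hiz, ...). Returns 0 on success, or -1 if
 * the aux offset does not fit the 32-bit offsets[] field.
 */
int demo_fill_fb_with_aux(struct demo_fb_cmd *cmd, uint32_t handle,
                          uint32_t width, uint32_t height, uint32_t fourcc,
                          uint32_t main_pitch, uint32_t aux_pitch,
                          uint64_t modifier)
{
    assert(cmd != NULL);

    uint64_t aux_offset =
        demo_align_up((uint64_t)main_pitch * height, DEMO_AUX_ALIGN);
    if (aux_offset > UINT32_MAX)
        return -1;

    memset(cmd, 0, sizeof(*cmd));
    cmd->width = width;
    cmd->height = height;
    cmd->pixel_format = fourcc;
    cmd->flags = DEMO_FB_MODIFIERS;

    /* Plane 0: the "real" color data. */
    cmd->handles[0] = handle;
    cmd->pitches[0] = main_pitch;
    cmd->offsets[0] = 0;
    cmd->modifier[0] = modifier;

    /* Plane 1: aux data, same buffer, placed after the main surface. */
    cmd->handles[1] = handle;
    cmd->pitches[1] = aux_pitch;
    cmd->offsets[1] = (uint32_t)aux_offset;
    cmd->modifier[1] = modifier;

    return 0;
}
```

The modifier tells the driver that plane 1 exists and what it holds; the bulk data itself never has to fit in the 64 bits.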
And since there's no patches for nouveau itself I can't really say anything beyond that.
I can work on implementing these interfaces for nouveau, maybe partially, if that's going to help. I just thought it'd be better to first start a discussion on what would be the right way to pass allocator metadata to display drivers before starting to seriously implement any of the proposed options.
It's not so much wiring down the interfaces, but actually implementing the features. "We need more than the 56bits of modifier" is a lot more plausible when you have the full stack showing that you do actually need it. Or well, not a full stack but at least a demo that shows what you want to pull off but can't do right now.
I'd like to see concrete examples of actual display controllers supporting more format layouts than what can be specified with a 64 bit modifier.
The main problem is our tiling and other metadata parameters can't generally fit in a modifier, so we find passing a blob of metadata a more suitable mechanism.
I understand that you may have n knobs with a total of more than 56 bits that configure your tiling/swizzling for color buffers. What I don't buy is that you need all those combinations when passing buffers around between codecs, cameras and display controllers. Even if you're sharing between the same 3D drivers in different processes, I expect just locking down, say, 64 different combinations (you can add more over time) and assigning each a modifier would be sufficient. I doubt you'd extract meaningful performance gains from going all the way to a blob.
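As a rough sketch of what I mean: the modifier keeps the 8-bit vendor / 56-bit value split used by fourcc_mod_code() in drm_fourcc.h, and the value is simply an index into a locked-down table of layouts worth sharing. The demo_layout knobs and the table entries are made up for illustration and are not real hardware parameters; only the bit split and the vendor ID value (0x03, as in DRM_FORMAT_MOD_VENDOR_NVIDIA) come from the header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MOD_VENDOR_SHIFT  56
#define DEMO_MOD_VALUE_MASK    0x00ffffffffffffffULL
#define DEMO_MOD_VENDOR_NVIDIA 0x03ULL

/* Same packing as fourcc_mod_code(): vendor in bits 63..56, value below. */
uint64_t demo_mod_code(uint64_t vendor, uint64_t value)
{
    return (vendor << DEMO_MOD_VENDOR_SHIFT) | (value & DEMO_MOD_VALUE_MASK);
}

/* Hypothetical tiling knobs; a real driver may have many more. */
struct demo_layout {
    uint8_t block_height_log2;
    uint8_t compression_kind;
    uint8_t page_kind;
};

/* The locked-down set. Append only, so existing modifiers stay stable. */
static const struct demo_layout demo_shared_layouts[] = {
    { 3, 0, 0 },
    { 4, 0, 1 },
    { 4, 1, 1 },
};

#define DEMO_NUM_LAYOUTS \
    (sizeof(demo_shared_layouts) / sizeof(demo_shared_layouts[0]))

/* Returns 0 and sets *mod if the layout is shareable, -1 otherwise. */
int demo_layout_to_modifier(const struct demo_layout *layout, uint64_t *mod)
{
    assert(layout != NULL && mod != NULL);

    for (size_t i = 0; i < DEMO_NUM_LAYOUTS; i++) {
        const struct demo_layout *c = &demo_shared_layouts[i];

        if (c->block_height_log2 == layout->block_height_log2 &&
            c->compression_kind == layout->compression_kind &&
            c->page_kind == layout->page_kind) {
            *mod = demo_mod_code(DEMO_MOD_VENDOR_NVIDIA, i);
            return 0;
        }
    }
    return -1;
}

/* Returns the layout for a modifier, or NULL if it is not one of ours. */
const struct demo_layout *demo_modifier_to_layout(uint64_t mod)
{
    uint64_t vendor = mod >> DEMO_MOD_VENDOR_SHIFT;
    uint64_t value = mod & DEMO_MOD_VALUE_MASK;

    if (vendor != DEMO_MOD_VENDOR_NVIDIA || value >= DEMO_NUM_LAYOUTS)
        return NULL;
    return &demo_shared_layouts[value];
}
```

A layout that isn't in the table just can't be shared across devices; the driver can still use it internally.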
Tegra just redesigned its modifier space from an ungodly amount of bits to just a few layouts. Not even just the ones in use, but simply limiting to the ones that make sense (there's dependencies apparently). Also note that the modifier alone doesn't need to describe the layout precisely; it only makes sense together with a specific pixel format and size. E.g. a bunch of the i915 layouts change layout depending upon bpp.
If you want us to redesign KMS and the rest of the ecosystem around blobs instead of the modifiers that are now moderately pervasive, you have to justify it a little better than just "we didn't find it suitable".
Given that this involves the kernel and hence the kernel's userspace requirements for merging stuff (assuming of course you want to establish this as an upstream interface), then I'd say a sufficient demonstration would be actually running out of bits in nouveau (kernel+mesa). -Daniel