On 11.05.21 at 09:31, Daniel Vetter wrote:
[SNIP]
And that's just the one ioctl I know is big trouble, I'm sure we'll find more funny corner cases when we roll out explicit user fencing.
I think we can just ignore sync_file. As far as I'm concerned, that UAPI is pretty much dead.
Uh that's rather bold. Android is built on it. Currently atomic kms is built on it.
To be honest I don't think we care about Android at all.
What we should support is drm_syncobj, but even that only as an in-fence, since that's what our hardware supports.
Convince Android folks, minimally. Probably a lot more. Yes with hindsight we should have just gone for drm_syncobj instead of the sync_file thing, but hindsight and all that.
This is kinda why I don't think trying to support the existing uapi with userspace fences underneath with some magic tricks is a good idea. It's just a pile of work, plus it's not really architecturally clean.
Another one that looks very sketchy right now is buffer sharing between different userspace drivers, like compute <-> media (if you have some fancy AI pipeline in your media workload, as an example).
Yeah, we are certainly going to get that. But only inside the same driver, so not much of a problem.
Why is this not much of a problem if it's just within one driver?
Because inside the same driver I can easily add the waits before submitting the MM work as necessary.
Adding implicit synchronization on top of that is then rather trivial.
Well that's what I disagree with, since I already see some problems that I don't think we can overcome (the atomic ioctl is one). And that's with us only having a fairly theoretical understanding of the overall situation.
But how should we then ever support user fences with the atomic IOCTL?
We can't wait in user space since that will disable the support for waiting in the hardware.
Well, figure it out :-)
This is exactly why I'm not seeing anything solved with just rolling a function call to a bunch of places, because it's pretending all things are solved when clearly that's not the case.
I really think what we need is to first figure out how to support userspace fences as explicit entities across the stack, maybe with something like this order:
1. enable them purely within a single userspace driver (like vk with winsys disabled, or something else like that except not amd because there's this amdkfd split for "real" compute)
1a. including atomic ioctl, e.g. for vk direct display support this can be used without cross-process sharing, new winsys protocols and all that fun
2. figure out how to transport these userspace fences with something like drm_syncobj
2a. figure out the compat story for drivers which don't do userspace fences
2b. figure out how to absorb the overhead if the winsys/compositor doesn't support explicit sync
3. maybe figure out how to make this all happen magically with implicit sync, if we really, really care
If we do 3 before we've nailed all these problems, we're just guaranteeing we'll get the wrong solutions and so we'll then have 3 ways of doing userspace fences:
- the butchered implicit one that didn't quite work
- the explicit one
- the not-so-butchered implicit one with the lessons from the properly done explicit one
The thing is, if you have no idea how to integrate userspace fences explicitly into atomic ioctl, then you definitely have no idea how to do it implicitly :-)
Well, I agree on that. But the question is still how you would do explicit with atomic.
Transporting fences between processes is not the fundamental problem here, but rather the question of how we represent all this in the kernel.
In other words, I think what you outlined above is just approaching it from the wrong side again. Instead of looking at what the kernel needs to support this, you take a look at userspace and the requirements there.
Regards, Christian.
And while "just block" might be good enough for a quick demo, it still breaks the contract. Same holds for a bunch of the winsys problems we'll have to deal with here. -Daniel
Regards, Christian.
Like here at Intel we have internal code for compute, and we're starting to hit some interesting cases with interop with media already, but that's it. Nothing even close to desktop/winsys/kms, and that's where I expect all the pain will be.
Cheers, Daniel