Hi, Steven!
Thanks for such a detailed explanation of T628 peculiarities.
On Wed, Jan 12, 2022 at 05:03:15PM +0000, Steven Price wrote:
On 10/01/2022 17:42, Alyssa Rosenzweig wrote:
Whether it's worth the effort depends on whether anyone really cares about getting the full performance out of this particular GPU.
At this stage I think the main UABI change would be to add the opposite flag to kbase, (e.g. "PANFROST_JD_DOESNT_NEED_COHERENCY_ON_GPU"[1]) to opt-in to allowing the job to run across all cores.
The second change would be to allow compute jobs to be run on the second core group, so another flag: PANFROST_RUN_ON_SECOND_CORE_GROUP.
But clearly there's little point adding such flags until someone steps up to do the Mesa work.
I worry about the maintainence burden (both Mesa and kernel) of adding UABI only used by a piece of hardware none of us own, and only useful "sometimes" for that hardware. Doubly so for the second core group support; currently Mesa doesn't advertise any compute support on anything older than Mali T760 ... to the best of my knowledge, nobody has missed that support either...
I agree there's no point adding the UABI support unless someone is willing to step and be a maintainer for that hardware. And I suspect no one cares enough about that hardware to do that.
To be clear I am in favour of merging the patches needed for GLES2 to work on all Malis, possibly at a performance cost on these dual-core systems. That's a far cry from the level of support the DDK gave these chips back in the day ... of course, the DDK doesn't support them at all anymore, so Panfrost wins there by default! ;)
Agreed - I'm happy to merge a kernel series similar to this. I think the remaining problems are:
- Addressing Robin's concerns about the first patch. That looks like
it's probably just wrong.
The first patch is wrong and I'll drop it.
- I think this patch is too complex for the basic support. There's some
parts like checking GROUPS_L2_COHERENT which also don't feature in kbase
That part has been adapted from kbase_gpuprops_construct_coherent_groups, see https://github.com/hardkernel/linux/blob/2f0f4268209ddacc2cdea158104b87cedac...
so I don't believe are correct.
I'm not sure if it's correct or not, however - it does not change anything for GPUs with coherent L2 caches - it appears to correctly figure out core groups for several SoCs with T628 GPU (BE-M1000, Exynos 5422).
- I don't think this blocks the change. But if we're not using the
second core group we could actually power it down. Indeed simply not turning on the L2/shader cores should in theory work (jobs are not scheduled to cores which are turned off even if they are included in the affinity).
Powering off unused GPU is would be nice, however
1) the userspace might power on those cores again (via sysfs or something), so I prefer to explicitly schedule jobs to the core group 0.
2) on BE-M1000 GPU seems to lock up in a few seconds after powering off some (GPU) cores. In fact I had to disable GPU devfreq to prevent GPU lockups.
Therefore I consider powering off unused cores as a later optimization. (frankly speaking I'd better put the effort to *making use* of those cores instead of figuring out why they fail to power down properly).
Best regards, Alexey