Lucas Stach <l.stach@pengutronix.de> writes:
On Wednesday, 2018-06-27 at 10:25 -0700, Eric Anholt wrote:
Lucas Stach <l.stach@pengutronix.de> writes:
When the hangcheck handler was replaced by the DRM scheduler timeout handling, we dropped the forward progress check, as this check might allow clients to hog the GPU for a long time with a big job.
It turns out that even reasonably well-behaved clients like the Armada Xorg driver occasionally trip over the 500ms timeout. Bring back the forward progress check to get rid of the userspace regression.
We would still like to fix userspace to submit smaller batches if possible, but that is for another day.
Fixes: 6d7a20c07760 ("drm/etnaviv: replace hangcheck with scheduler timeout")
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
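[Editor's note: a minimal sketch of what such a forward progress check in the etnaviv timedout_job handler could look like, not the literal patch. It assumes the etnaviv internals of that time (to_etnaviv_submit(), gpu_read(), the VIVS_FE_DMA_ADDRESS register) plus an assumed hangcheck_dma_addr bookkeeping field, and it assumes the scheduler still used a per-job delayed work for the timeout.]

#include <linux/workqueue.h>
#include <drm/gpu_scheduler.h>
#include "etnaviv_gpu.h"	/* driver internals, assumed available */

static void etnaviv_sched_timedout_job(struct drm_sched_job *sched_job)
{
	struct etnaviv_gem_submit *submit = to_etnaviv_submit(sched_job);
	struct etnaviv_gpu *gpu = submit->gpu;
	u32 dma_addr;
	int change;

	/*
	 * If the front-end DMA address has moved since the last timeout,
	 * the GPU is still making forward progress: remember the new
	 * position and push the timeout out instead of declaring a hang.
	 */
	dma_addr = gpu_read(gpu, VIVS_FE_DMA_ADDRESS);
	change = dma_addr - gpu->hangcheck_dma_addr;	/* assumed field */
	if (change < 0 || change > 16) {
		gpu->hangcheck_dma_addr = dma_addr;
		/* assumed rearm of the per-job timeout work */
		schedule_delayed_work(&sched_job->work_tdr,
				      sched_job->sched->timeout);
		return;
	}

	/*
	 * No forward progress: fall through to the existing recovery path
	 * (park the scheduler, dump the core, reset the GPU).
	 */
}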
I was just wondering if there was a way to do this with the scheduler (I had a similar issue with GTF-GLES2.gtf.GL.acos.acos_float_vert_xvary), and this looks correct.
What do you have in mind? A forward progress check at sub-fence granularity is always going to be GPU-specific. The only thing that could be shunted to the scheduler is the rearming of the timer. We could do this by changing the return type of timedout_job to something that allows the driver to indicate a false positive to the scheduler.
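[Editor's note: purely as an illustration of that idea, a sketch with hypothetical names; no such return type existed in the scheduler at the time, so this would only build together with a matching change to gpu_scheduler.h. The driver reports whether the timeout was real, and only the scheduler touches its timer.]

/* Hypothetical status a driver could return from ->timedout_job(). */
enum drm_sched_timeout_status {
	DRM_SCHED_TIMEOUT_HANG,		/* real hang, start recovery */
	DRM_SCHED_TIMEOUT_PROGRESS,	/* false positive, job still running */
};

/* Driver side: report forward progress instead of rearming by hand. */
static enum drm_sched_timeout_status
etnaviv_sched_timedout_job(struct drm_sched_job *sched_job)
{
	if (etnaviv_gpu_made_progress(sched_job))	/* hypothetical helper */
		return DRM_SCHED_TIMEOUT_PROGRESS;

	/* ... existing dump/reset/recovery path ... */
	return DRM_SCHED_TIMEOUT_HANG;
}

/*
 * Scheduler side (simplified): only a real hang stops the world, a false
 * positive just pushes the timeout out again.
 */
static void drm_sched_job_timedout(struct work_struct *work)
{
	struct drm_sched_job *job = container_of(to_delayed_work(work),
						 struct drm_sched_job,
						 work_tdr);

	if (job->sched->ops->timedout_job(job) == DRM_SCHED_TIMEOUT_PROGRESS) {
		schedule_delayed_work(&job->work_tdr, job->sched->timeout);
		return;
	}

	/* ... hang recovery and job resubmission as before ... */
}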
That sounds like a nice future improvement.