Lucas Stach <l.stach@pengutronix.de> writes:
When the hangcheck handler was replaced by the DRM scheduler timeout handling we dropped the forward progress check, as this might allow clients to hog the GPU for a long time with a big job.
It turns out that even reasonably well-behaved clients like the Armada Xorg driver occasionally trip over the 500ms timeout. Bring back the forward progress check to get rid of the userspace regression.
We would still like to fix userspace to submit smaller batches if possible, but that is for another day.
Fixes: 6d7a20c07760 ("drm/etnaviv: replace hangcheck with scheduler timeout")
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
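For illustration only, the decision such a timeout handler has to make can be sketched as standalone C. This is not the code from the patch: struct gpu_model, handle_timeout() and the slack parameter are hypothetical names, and judging progress from the front-end DMA address is an assumption made for this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the state a timeout handler would inspect. */
struct gpu_model {
	uint32_t completed_fence;    /* last fence seqno the GPU signalled */
	uint32_t fe_dma_addr;        /* current front-end fetch address (assumed) */
	uint32_t hangcheck_dma_addr; /* address recorded at the previous timeout */
};

enum timeout_action {
	TIMEOUT_SPURIOUS, /* job fence already signalled, nothing to do */
	TIMEOUT_REARM,    /* still making progress, extend the timeout */
	TIMEOUT_RECOVER,  /* no progress seen, treat the job as hung */
};

/* Wrap-safe "a is at or after b" for 32-bit sequence numbers. */
static bool seqno_after_eq(uint32_t a, uint32_t b)
{
	return (uint32_t)(a - b) < UINT32_C(0x80000000);
}

static enum timeout_action handle_timeout(struct gpu_model *gpu,
					  uint32_t job_fence, uint32_t slack)
{
	uint32_t cur = gpu->fe_dma_addr;
	uint32_t last = gpu->hangcheck_dma_addr;

	/* The fence signalled between timer expiry and this handler running. */
	if (seqno_after_eq(gpu->completed_fence, job_fence))
		return TIMEOUT_SPURIOUS;

	/*
	 * Count it as progress if the address moved backwards (a jump) or
	 * forwards by more than 'slack' bytes since the last timeout.
	 */
	if (cur < last || cur - last > slack) {
		gpu->hangcheck_dma_addr = cur; /* remember for the next timeout */
		return TIMEOUT_REARM;
	}

	return TIMEOUT_RECOVER;
}
```

With a slack of 16, a job whose fetch address advanced from 0x1000 to 0x2000 gets its timeout re-armed once; if the address has not moved by the next expiry, the same call returns TIMEOUT_RECOVER.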
I was just wondering if there was a way to do this with the scheduler (I had a similar issue with GTF-GLES2.gtf.GL.acos.acos_float_vert_xvary), and this looks correct.
As far as I can see, the fence_completed() check shouldn't be necessary, since you'll get a cancel_delayed_work_sync() once the job finish happens. So you're only really protecting against a timeout that fails to detect progress between fence signal and job finish, and we expect job finish to be quick.
Regardless,
Reviewed-by: Eric Anholt <eric@anholt.net>