I'm surprised this made such a big difference. It should not take a long time to flush the L2T -- no memory access should be involved, it should just be a matter of the L2T iterating over its lines clearing the tags. This should take thousands of V3D cycles, not milliseconds. So the 3-4ms stall seems worthy of more investigation to me.
A comment describing why no waits are necessary would be good.
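As a rough sketch of what such a comment could say: the wording below is only a suggestion, and the register macros and the reg_write() helper are hypothetical userspace stand-ins with illustrative bit values, not the real v3d definitions. The only point is that the flush is kicked off and nothing polls the busy bit afterwards.

```c
#include <stdint.h>

/* Hypothetical stand-ins so the sketch builds outside the kernel.
 * Bit positions are illustrative, not taken from v3d_regs.h.
 */
#define V3D_L2TCACTL_L2TFLS	(1u << 0)
#define V3D_L2TCACTL_FLM_FLUSH	0u
#define V3D_L2TCACTL_FLM_SHIFT	1

/* Models the V3D_CTL_L2TCACTL register for this sketch. */
static uint32_t fake_l2tcactl;

/* Stand-in for V3D_CORE_WRITE() on the modelled register. */
static void reg_write(uint32_t val)
{
	fake_l2tcactl = val;
}

static void flush_l2t_sketch(void)
{
	/* Start the flush and return without polling
	 * V3D_L2TCACTL_L2TFLS: once a flush has been started, the HW
	 * blocks all L2T accesses until it completes, so there is
	 * nothing for the driver to wait on here.
	 */
	reg_write(V3D_L2TCACTL_L2TFLS |
		  (V3D_L2TCACTL_FLM_FLUSH << V3D_L2TCACTL_FLM_SHIFT));
}
```

In the driver itself this would just be the comment placed above the existing V3D_CORE_WRITE() call in v3d_flush_l2t().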
On Sat, 1 Dec 2018 at 00:58, Eric Anholt <eric@anholt.net> wrote:
According to Dave, once you've started an L2T flush, all L2T accesses will be blocked until the flush completes. This fixes a consistent 3-4ms stall between the ioctl and running the job, and 3DMMES Taiji goes from 27fps to 110fps.
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Dave Emett <david.emett@broadcom.com>
Fixes: 57692c94dcbe ("drm/v3d: Introduce a new DRM driver for Broadcom V3D V3.x+")
 drivers/gpu/drm/v3d/v3d_gem.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c
index cc4d025b01e0..0bd6892e3044 100644
--- a/drivers/gpu/drm/v3d/v3d_gem.c
+++ b/drivers/gpu/drm/v3d/v3d_gem.c
@@ -146,10 +146,6 @@ v3d_flush_l2t(struct v3d_dev *v3d, int core)
 	V3D_CORE_WRITE(core, V3D_CTL_L2TCACTL,
 		       V3D_L2TCACTL_L2TFLS |
 		       V3D_SET_FIELD(V3D_L2TCACTL_FLM_FLUSH, V3D_L2TCACTL_FLM));
-	if (wait_for(!(V3D_CORE_READ(core, V3D_CTL_L2TCACTL) &
-		       V3D_L2TCACTL_L2TFLS), 100)) {
-		DRM_ERROR("Timeout waiting for L2T flush\n");
-	}
 }

 /* Invalidates the slice caches.  These are read-only caches. */
2.20.0.rc1