On Tue, Feb 23, 2021 at 1:42 PM Neil Roberts <nroberts@igalia.com> wrote:
> Daniel Vetter <daniel@ffwll.ch> writes:
>> Yeah plus Cc: stable for backporting and I think an igt or similar for panfrost to check this works correctly would be pretty good too. Since if it took us over 1 year to notice this bug it's pretty clear that normal testing doesn't catch this. So very likely we'll break this again.
>
> I made the IGT test below which seems to reproduce the bug. However, the kernel patch doesn’t fix it, so maybe there is something more subtle going on.
>
> https://gitlab.freedesktop.org/nroberts/igt-gpu-tools/-/commits/panfrost-pur...
drm_gem_shmem_fault() does not seem to check for purged objects at all.
No idea how this works, or if it ever worked, but yeah something is clearly still busted.
Definitely a good idea to have an igt. btw to make that faster you can either use the vm_drop_caches file from proc (it's a bit of a hammer), or what I recommend: Have a dedicated debugfs file to only drop everything from your shrinker. That's much quicker and more controlled. See e.g. ttm_tt_debugfs_shrink from d4bd7776a7ac ("drm/ttm: rework ttm_tt page limit v4") which recently landed in drm-misc-next.
-Daniel
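
For reference, the "hammer" option could look roughly like the userspace sketch below. It assumes the proc file meant above is /proc/sys/vm/drop_caches, where writing "2" asks the kernel to reclaim slab objects, which runs every registered shrinker rather than only panfrost's. The helper names write_control_file() and drop_reclaimable_slab() are made up for illustration and are not IGT API; the write needs root.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Write the string `value` to a control file such as a sysctl or
 * debugfs entry. Returns 0 on success or a negative errno value.
 * (Hypothetical helper, not part of IGT.)
 */
static int write_control_file(const char *path, const char *value)
{
	size_t len = strlen(value);
	ssize_t written;
	int err = 0;
	int fd;

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -errno;

	written = write(fd, value, len);
	if (written < 0)
		err = -errno;
	else if ((size_t)written != len)
		err = -EIO; /* short write: treat as an I/O error */

	if (close(fd) < 0 && err == 0)
		err = -errno;

	return err;
}

/*
 * Ask the kernel to reclaim slab objects. This invokes all registered
 * shrinkers system-wide, so it is slow and affects far more than the
 * driver under test. Assumed path: /proc/sys/vm/drop_caches.
 */
int drop_reclaimable_slab(void)
{
	return write_control_file("/proc/sys/vm/drop_caches", "2");
}
```

The same write_control_file() helper would work unchanged against a dedicated debugfs file, if the driver grew one, which is the reason to keep the path a parameter.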