https://bugs.freedesktop.org/show_bug.cgi?id=99312
Vedran Miletić vedran@miletic.net changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|Long-running OpenCL kernels |Long-running OpenCL kernels
                   |cause ring stalls and GPU   |cause ring stalls and GPU
                   |lockups on Kabini           |lockups on Kabini when
                   |                            |radeon.lockup_timeout is
                   |                            |enabled
--- Comment #2 from Vedran Miletić vedran@miletic.net ---
(In reply to John Bridgman from comment #1)
> If you have not already done so, try disabling the watchdog timer:
>
> MODULE_PARM_DESC(lockup_timeout, "GPU lockup timeout in ms (default 10000 = 10 seconds, 0 = disable)");
> module_param_named(lockup_timeout, radeon_lockup_timeout, int, 0444);
Yup, that works around the problem.
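
For anyone reproducing this: the parameter is declared with mode 0444, so it is read-only once the module is loaded and has to be set at load time, for example with `radeon.lockup_timeout=0` on the kernel command line or an `options radeon lockup_timeout=0` line in a modprobe.d file. The following is a minimal sketch, not part of the original report, for checking which value is in effect. It assumes the standard sysfs location for module parameters; the function names are made up for illustration.

```python
from pathlib import Path

# Assumed standard sysfs path for the radeon module parameter; it is
# world-readable because the parameter is declared with mode 0444.
PARAM_PATH = Path("/sys/module/radeon/parameters/lockup_timeout")


def read_lockup_timeout(path=PARAM_PATH):
    """Return the timeout in ms, or None if the parameter file is missing."""
    try:
        return int(Path(path).read_text().strip())
    except FileNotFoundError:
        # Typically means the radeon module is not loaded.
        return None


def describe(timeout_ms):
    """Turn the raw parameter value into a human-readable status line."""
    if timeout_ms is None:
        return "radeon module not loaded (parameter not found)"
    if timeout_ms == 0:
        return "lockup watchdog disabled"
    return f"lockup watchdog enabled, timeout {timeout_ms} ms"


if __name__ == "__main__":
    print(describe(read_lockup_timeout()))
```

With the workaround applied, the script should report the watchdog as disabled; with the default setting it should report a timeout of 10000 ms.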
> As part of HSA/ROC development we dropped the priority of compute work
> relative to graphics which improved interactivity and *almost* eliminated
> timeouts without having to disable the timer - when I get back in the
> office I'll dig up the changes. In the meantime, I think disabling the
> timer will do what you need although you will still have sluggish graphics
> while long-running kernels are active.
Eager to hear the details.