https://bugs.freedesktop.org/show_bug.cgi?id=102322
--- Comment #41 from dwagner jb5sgc1n.nya@20mm.eu --- (In reply to Andrey Grodzovsky from comment #40)
Created attachment 141112 .config
I uploaded my .config file. Maybe something in your Kconfig flags makes this happen. You can try rebuilding the latest kernel from Alex's repository using my .config and see whether you still experience this. https://cgit.freedesktop.org/~agd5f/linux/log/?h=amd-staging-drm-next
Did just that, but the video test still crashes after at most a few minutes, and does not crash with DPM turned off. So we can rule out our .config differences (of which there are many).
Other than that, since your system hard hangs and you can't do any postmortem dumps, you can at least provide output from event tracing through trace_pipe to catch live logs on the fly. Maybe we can infer something from there...
So again: load the system and, before starting to reproduce, run the following trace command:
sudo trace-cmd start -e dma_fence -e gpu_scheduler -e amdgpu -v -e "amdgpu:amdgpu_mm_rreg" -e "amdgpu:amdgpu_mm_wreg" -e "amdgpu:amdgpu_iv"
then cd /sys/kernel/debug/tracing && cat trace_pipe
When the problem happens, just copy all the output from the terminal to a log file. Make sure your terminal app has the largest possible scrollback buffer so it catches ALL the output.
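As an alternative to copying from the terminal scrollback, the output could be streamed straight into a file while it is still shown on screen. The following is only a rough sketch, not something suggested in this thread: `capture_trace` is a made-up helper name, the log path is an assumption, and the once-per-second `sync` is just a guess at a reasonable way to limit how much data is lost when the machine hard hangs. Whatever was written during the last second before the hang may still be missing.

```shell
#!/bin/sh
# Sketch only: capture_trace is a hypothetical helper, not part of trace-cmd.
# Usage (as root, after running the trace-cmd start command above):
#   capture_trace /sys/kernel/debug/tracing/trace_pipe /root/amdgpu-trace.log
capture_trace() {
    src="$1"   # file or pipe to read from, e.g. trace_pipe
    log="$2"   # log file to append to (path is an assumption)

    # Flush dirty pages to disk once per second in the background, so that
    # most of the log survives a hard hang.
    ( while :; do sleep 1; sync; done ) >/dev/null 2>&1 &
    sync_pid=$!

    # Show the output on the terminal and append it to the log file.
    tee -a "$log" < "$src"

    # Reached only if the source ends (trace_pipe normally blocks forever).
    kill "$sync_pid" 2>/dev/null
    wait "$sync_pid" 2>/dev/null || true
    sync
}
```

Writing the log to a disk that is not under heavy load, or piping the output to a second machine instead, would probably make it more likely that the last events before the hang are preserved.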
Will try that on next opportunity, probably tomorrow evening.