https://bugs.freedesktop.org/show_bug.cgi?id=107928
--- Comment #4 from dwagner jb5sgc1n.nya@20mm.eu ---
@ Vik-T: From the way you describe your symptoms, it seems possible to me that you are experiencing the very same long-standing bug that I reported in https://bugs.freedesktop.org/show_bug.cgi?id=102322
If you want to verify whether that bug and yours are actually the same, you could try the following:
(a) Check whether you still experience your bug after disabling dynamic power management. To do this, switch to manual power management like this:
    cd /sys/class/drm/card0/device
    echo manual >power_dpm_force_performance_level
    echo 0 >pp_dpm_mclk
    echo 0 >pp_dpm_sclk
In my case, the bug does not occur while clocks are set manually. Caveat: These settings are ignored/overwritten by the amdgpu driver after each display mode change and each time the display output or monitor is switched off and on again. So this test is meaningful only if the manual settings are re-applied after each such display mode change or switch-on. (I reported this bug as https://bugs.freedesktop.org/show_bug.cgi?id=107141 )
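Since the manual settings get overwritten, one crude way to keep them in place during the test is to re-apply them periodically. The following is only a minimal sketch, not something I have verified on your system: the names DEV_DIR, INTERVAL, RUN_LOOP and apply_manual_clocks are made up for this example, and polling leaves a short window after each mode change in which dynamic power management is active again, so a crash during that window would not disprove anything.

```shell
#!/bin/sh
# Sketch: periodically re-apply the manual amdgpu clock settings.
# DEV_DIR, INTERVAL and RUN_LOOP are hypothetical names for this example.
DEV_DIR="${DEV_DIR:-/sys/class/drm/card0/device}"

# Write the same three values as the manual commands above.
apply_manual_clocks() {
    dir="$1"
    echo manual > "$dir/power_dpm_force_performance_level" &&
    echo 0 > "$dir/pp_dpm_mclk" &&
    echo 0 > "$dir/pp_dpm_sclk"
}

# The loop only runs when explicitly requested (and needs root).
if [ "${RUN_LOOP:-0}" = 1 ]; then
    while :; do
        apply_manual_clocks "$DEV_DIR"
        sleep "${INTERVAL:-5}"
    done
fi
```

Saved as e.g. reapply.sh, this could be started as root with "RUN_LOOP=1 sh reapply.sh" and left running in a separate terminal or over ssh for the duration of the test.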
(b) You could check if you can reproduce the symptom more quickly with a certain load pattern:
(1) Enable dynamic power management (which is also the default)
(2) Start X11, but not any client (or desktop environment) that draws anything on the screen
(3) Play back a video (at least 1080p) at only 3 frames per second, e.g. via: "mpv --no-correct-pts --fps=3 --ao=null some_arbitrary_video.webm"
This kind of load causes (at least on my system) frequent changes to the pp_dpm_mclk and pp_dpm_sclk values, and the system crashes after only a short while (anywhere from a few seconds to 15 minutes), with the symptoms you described (blanked screen, system crash).
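To see whether the clock levels really change frequently under this load, the two files can be watched from a second terminal or an ssh session. This is a rough sketch under the assumption that the currently active level is the line marked with "*" in pp_dpm_mclk and pp_dpm_sclk; DEV_DIR, RUN_WATCH and active_level are names invented for this example. It only samples once per second, so it will miss faster switches.

```shell
#!/bin/sh
# Sketch: log changes of the active memory/shader clock level.
# DEV_DIR and RUN_WATCH are hypothetical names for this example.
DEV_DIR="${DEV_DIR:-/sys/class/drm/card0/device}"

# Print the line(s) of a pp_dpm_* file marked with '*'
# (assumed to be the currently active level).
active_level() {
    grep -F '*' "$1"
}

# The watch loop only runs when explicitly requested.
if [ "${RUN_WATCH:-0}" = 1 ]; then
    prev=""
    while :; do
        mclk=$(active_level "$DEV_DIR/pp_dpm_mclk")
        sclk=$(active_level "$DEV_DIR/pp_dpm_sclk")
        cur="mclk=[$mclk] sclk=[$sclk]"
        # Print a timestamped line only when something changed.
        if [ "$cur" != "$prev" ]; then
            echo "$(date '+%H:%M:%S') $cur"
            prev="$cur"
        fi
        sleep 1
    done
fi
```

Started with "RUN_WATCH=1 sh watch_clocks.sh" (the file name is arbitrary), the last lines printed before a crash would show which levels were active shortly before it.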