https://bugs.freedesktop.org/show_bug.cgi?id=101528
Bug ID: 101528
Summary: RX460 Memory clock stays high until card is "used"
Product: DRI
Version: unspecified
Hardware: x86-64 (AMD64)
OS: Linux (All)
Status: NEW
Severity: normal
Priority: medium
Component: DRM/AMDgpu
Assignee: dri-devel@lists.freedesktop.org
Reporter: sverd.johnsen@googlemail.com
linux 4.11.4
I use the iGPU on SKL and an RX460 as a dedicated card. In Xorg I turned autobindgpu and autoaddgpu off and singlecard on. With the fbcon=map kernel parameter I map the Linux ttys to different cards. To switch to the AMD card I do a VT switch and then change the input on my monitor. This works fine and lets me log in via agetty and launch a new X server. One thing I noticed is that until I actually do that, the memory clock of the card stays high (17xx MHz?), which makes it 5-8°C hotter and probably makes it use more power than it needs.
--- Comment #1 from Alexander Tsoy alexander@tsoy.me ---
Same problem with TONGA. When the GPU is idle, mclk goes to its maximum. This is easily reproducible: just turn off the monitor. I noticed this issue after upgrading from 4.9.x kernels to 4.11.7. Maybe I'll check the mainline kernel later and/or bisect.
# cat /sys/class/drm/card0/device/pp_dpm_mclk
0: 150Mhz
1: 300Mhz
2: 700Mhz
3: 1450Mhz *
# sensors amdgpu-*
amdgpu-pci-0100
Adapter: PCI adapter
fan1: 2025 RPM
temp1: +47.0°C (crit = +0.0°C, hyst = +0.0°C)
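For anyone who wants to watch this state from a script or a small monitoring tool, here is a minimal C sketch that picks the active level out of pp_dpm_mclk text. It assumes the format is exactly what the output above shows (one "level: NNNMhz" line per level, with a trailing '*' on the active one); the function name active_mclk_mhz is made up for this example and is not part of any driver or library.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: given the text of pp_dpm_mclk, return the clock
 * (in MHz) of the level marked with '*', or -1 if no level is marked.
 * Assumes lines of the form "3: 1450Mhz *" as shown in this report.
 */
int active_mclk_mhz(const char *text)
{
    assert(text != NULL);

    const char *line = text;
    while (*line != '\0') {
        const char *end = strchr(line, '\n');
        size_t len = end ? (size_t)(end - line) : strlen(line);
        int level, mhz;

        /* Accept the line only if it parses as "level: mhz" and the
         * '*' marker appears within this same line. */
        if (sscanf(line, "%d: %d", &level, &mhz) == 2 &&
            memchr(line, '*', len) != NULL)
            return mhz;

        line = end ? end + 1 : line + len;
    }
    return -1;
}
```

To use it against a real card, read /sys/class/drm/card0/device/pp_dpm_mclk into a NUL-terminated buffer (for example with fopen and fread) and pass the buffer to the function.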
--- Comment #2 from Alexander Tsoy alexander@tsoy.me ---
I've added printing of some debug info into smu7_hwmgr.c and here is what I get before the GPU enters that state:
[ 778.701843] AMDGPU: vblank_time_us: 630, switch_limit_us: 450
[ 778.707608] AMDGPU: vblank_time_us: 630, switch_limit_us: 450
[ 778.713379] AMDGPU: disable_mclk_switching: 0, disable_mclk_switching_for_frame_lock: 0, info.display_count: 1, smu7_vblank_too_short: 0, mode_info.refresh_rate: 60
[ 778.748777] AMDGPU: vblank_time_us: 0, switch_limit_us: 450
[ 778.754361] AMDGPU: vblank_time_us: 0, switch_limit_us: 450
[ 778.759951] AMDGPU: disable_mclk_switching: 1, disable_mclk_switching_for_frame_lock: 0, info.display_count: 1, smu7_vblank_too_short: 1, mode_info.refresh_rate: 0
For some reason if refresh_rate = 0 then vblank_time_us = 0. Shouldn't the latter be 0xffffffff instead? So I guess the following commit is the culprit: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
and the following patch should fix (or workaround?) this issue:
- if (vblank_time_us < switch_limit_us)
+ if (vblank_time_us && (vblank_time_us < switch_limit_us))
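To make the effect of that one-line change concrete, here is a small stand-alone C model of the comparison. It is only a sketch: the function names are invented for this example, the real check lives in smu7_hwmgr.c and may look different, and the values are the ones from the debug log (switch_limit_us = 450).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Model of the check before the patch: any value below the limit,
 * including 0, is treated as "vblank too short". */
bool too_short_original(uint32_t vblank_time_us, uint32_t switch_limit_us)
{
    return vblank_time_us < switch_limit_us;
}

/* Model of the check after the patch: 0 is treated as "no vblank
 * timing available" and no longer counts as too short. */
bool too_short_guarded(uint32_t vblank_time_us, uint32_t switch_limit_us)
{
    return vblank_time_us && (vblank_time_us < switch_limit_us);
}

/* Replays the two situations seen in the debug log. */
void check_logged_values(void)
{
    assert(!too_short_original(630, 450)); /* display on: switching allowed */
    assert(too_short_original(0, 450));    /* display off: switching blocked */
    assert(!too_short_guarded(630, 450));  /* display on: unchanged */
    assert(!too_short_guarded(0, 450));    /* display off: switching allowed */
}
```

Calling check_logged_values() from a main() runs all four cases. The second assertion is the bug as reported: with the display off, a vblank time of 0 compares as shorter than the limit, so mclk switching gets disabled and the memory clock stays at its highest level.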
After applying it:
[ 409.588673] AMDGPU: vblank_time_us: 630, switch_limit_us: 450
[ 409.594427] AMDGPU: vblank_time_us: 630, switch_limit_us: 450
[ 409.600182] AMDGPU: disable_mclk_switching: 0, disable_mclk_switching_for_frame_lock: 0, info.display_count: 1, smu7_vblank_too_short: 0, mode_info.refresh_rate: 60
[ 409.639750] AMDGPU: vblank_time_us: 0, switch_limit_us: 450
[ 409.645321] AMDGPU: vblank_time_us: 0, switch_limit_us: 450
[ 409.650917] AMDGPU: disable_mclk_switching: 0, disable_mclk_switching_for_frame_lock: 0, info.display_count: 1, smu7_vblank_too_short: 0, mode_info.refresh_rate: 0
$ cat /sys/class/drm/card0/device/pp_dpm_mclk
0: 150Mhz *
1: 300Mhz
2: 700Mhz
3: 1450Mhz
--- Comment #3 from Alex Deucher alexdeucher@gmail.com --- Created attachment 132358 --> https://bugs.freedesktop.org/attachment.cgi?id=132358&action=edit possible fix
--- Comment #4 from Alexander Tsoy alexander@tsoy.me ---
(In reply to Alex Deucher from comment #3)
> Created attachment 132358
> possible fix
This patch fixes this bug for me. Thank you!
[ 359.229187] AMDGPU: vblank_time_us: 630, switch_limit_us: 450
[ 359.234933] AMDGPU: vblank_time_us: 630, switch_limit_us: 450
[ 359.240684] AMDGPU: disable_mclk_switching: 0, disable_mclk_switching_for_frame_lock: 0, info.display_count: 1, smu7_vblank_too_short: 0, mode_info.refresh_rate: 60
[ 359.283987] AMDGPU: vblank_time_us: 4294967295, switch_limit_us: 450
[ 359.290342] AMDGPU: vblank_time_us: 4294967295, switch_limit_us: 450
[ 359.296703] AMDGPU: disable_mclk_switching: 0, disable_mclk_switching_for_frame_lock: 0, info.display_count: 1, smu7_vblank_too_short: 0, mode_info.refresh_rate: 0
... ...
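The attachment itself is not quoted in this thread, so the following is only a guess at the behaviour the log implies: with refresh_rate at 0, vblank_time_us is now reported as 4294967295 (0xffffffff) rather than 0, and the unchanged comparison then lets mclk switching through. A minimal C sketch of that idea, with all names invented for illustration:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name for the "no vblank timing" value seen in the log:
 * 0xffffffff == 4294967295. */
#define VBLANK_TIME_UNKNOWN UINT32_MAX

/* Assumed behaviour: with no valid refresh rate, report the largest
 * possible vblank time instead of 0. */
uint32_t effective_vblank_time_us(uint32_t refresh_rate, uint32_t measured_us)
{
    return refresh_rate == 0 ? VBLANK_TIME_UNKNOWN : measured_us;
}

/* The plain comparison, with no special case for 0. */
bool sentinel_too_short(uint32_t vblank_time_us, uint32_t switch_limit_us)
{
    return vblank_time_us < switch_limit_us;
}

/* Replays the two situations from the log above. */
void check_sentinel_values(void)
{
    /* Display on: refresh_rate 60, vblank_time_us 630. */
    assert(!sentinel_too_short(effective_vblank_time_us(60, 630), 450));
    /* Display off: refresh_rate 0, reported as 4294967295. */
    assert(effective_vblank_time_us(0, 0) == 4294967295u);
    assert(!sentinel_too_short(effective_vblank_time_us(0, 0), 450));
}
```

Compared with guarding the comparison against 0, this keeps the check itself untouched and moves the special case to where the vblank time is produced, which matches the question raised in comment #2 about whether the value should be 0xffffffff.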
--- Comment #5 from Sverd Johnsen sverd.johnsen@googlemail.com --- Works for me on 4.11.10. Display off, MCLK is low and card temperature is 27°C as expected. Thanks.
Sverd Johnsen sverd.johnsen@googlemail.com changed:
What      |Removed                     |Added
----------------------------------------------------------------------------
Resolution|---                         |FIXED
Summary   |RX460 Memory clock stays    |RX460 Memory clock stays
          |high until card is "used"   |high until card / display
          |                            |is "used"
Status    |NEW                         |RESOLVED
Alexander Tsoy alexander@tsoy.me changed:
What      |Removed                     |Added
----------------------------------------------------------------------------
See Also  |                            |https://bugzilla.kernel.org/show_bug.cgi?id=196615